CN112567387B

CN112567387B - Characterizing activity and encoding and decoding information in recurrent artificial neural networks

Info

Publication number: CN112567387B
Application number: CN201980053465.4A
Authority: CN
Inventors: H·马克莱姆; R·利维; K·P·赫斯贝尔瓦尔德
Original assignee: Inet Co ltd
Current assignee: Inet Co ltd
Priority date: 2018-06-11
Filing date: 2019-06-05
Publication date: 2025-08-26
Anticipated expiration: 2039-06-05
Also published as: CN112567388B; KR102497238B1; CN121031659A; CN112567388A; KR102465409B1; KR102526132B1; CN120805977A; EP3803708A1; CN112567389A; WO2019238483A1; KR102488042B1; CN112567390A; EP3803699A1; CN112585621A; CN112567387A; TW202001693A; KR20210008419A; EP3803705A1; KR20210008417A; EP3803707A1

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for characterizing activity in a recurrent artificial neural network and for encoding and decoding information. In one aspect, a method may include characterizing activity in an artificial neural network. The method is performed by a data processing device and may include identifying a clique pattern of activity in the artificial neural network. The clique pattern of activity may enclose a cavity.

Description

Characterizing activity in a recurrent artificial neural network and encoding and decoding information

Background

The present description relates to the characterization of activity in a recurrent artificial neural network. The characterization of the activity may be used, for example, in the identification of decision moments and in encoding/decoding signals in scenarios such as transmission, encryption, and data storage. The description also relates to systems and techniques for encoding and decoding information and for using the encoded information in various scenarios. The encoded information may represent activity in a neural network (e.g., a recurrent neural network).

An artificial neural network is a device inspired by structural and functional aspects of a biological neural network. In particular, artificial neural networks use a system of interconnected constructs, known as nodes, to simulate the information encoding and other processing capabilities of biological neural networks. The arrangement and strength of the connections between nodes in the artificial neural network determine the outcome of the information processing or information storage by the artificial neural network.

The neural network may be trained to produce a desired signal flow in the network and to achieve a desired information processing or information storage result. Typically, training the neural network will change the arrangement and/or strength of the connections between nodes during the learning phase. A neural network may be considered trained when it achieves sufficiently appropriate processing results for a given set of inputs.

Artificial neural networks may be used in a wide variety of different devices to perform nonlinear data processing and analysis. Nonlinear data processing does not satisfy the superposition principle, i.e., the variable to be determined cannot be written as a linear sum of the individual components. Examples of scenarios where nonlinear data processing is useful include pattern and sequence recognition, speech processing, novelty detection and sequential decision-making, complex system modeling, and a wide variety of other scenarios.
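As an illustrative aside, the superposition principle and one way it can fail may be written out explicitly. The map f and the choice of the hyperbolic tangent below are assumptions made only for illustration; they are not taken from this description.

```latex
% A map f satisfies superposition (is linear) if, for all inputs x_1, x_2 and scalars a, b:
f(a x_1 + b x_2) = a\, f(x_1) + b\, f(x_2)

% A nonlinear map violates this for at least some inputs. For example, with f(x) = \tanh(x):
\tanh(1 + 1) \approx 0.964 \;\neq\; \tanh(1) + \tanh(1) \approx 1.523
```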

Both encoding and decoding convert information from one form or representation to another. Different representations may provide different features that are more or less useful in different applications. For example, some forms or representations of information (e.g., natural language) may be easier for humans to understand. Other forms or representations may be smaller in size (e.g., "compressed") and easier to transport or store. Still other forms or representations may intentionally obscure the information content (e.g., the information may be cryptographically encoded).

Regardless of the particular application, the encoding or decoding process will typically follow a predefined set of rules or algorithms that establish correspondence between information in different forms or representations. For example, the encoding process that produces the binary code may assign roles or meanings to individual bits based on their positions in the binary sequence or vector.

Disclosure of Invention

This specification describes techniques related to characterization of activity in an artificial neural network.

For example, in one implementation, a method may include characterizing activity in an artificial neural network. The method is performed by a data processing apparatus and may include identifying a clique pattern of activity of the artificial neural network. The clique pattern of activity encloses a cavity.

This and other implementations can include one or more of the following features. The method may include defining a plurality of time windows during which the activity of the artificial neural network is responsive to an input into the artificial neural network. The clique pattern of activity may be identified in each of the plurality of time windows. The method may include identifying a first time window within the plurality of time windows based on a distinguishable likelihood of the clique pattern of activity occurring during the first time window. Identifying the clique pattern may include identifying directed cliques of activity. Lower-dimensional directed cliques present in higher-dimensional directed cliques may be discarded or ignored.
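The identification of directed cliques, and the discarding of lower-dimensional cliques contained in higher-dimensional ones, might be sketched as follows. This is a minimal brute-force illustration and not the method of this description: it assumes that a directed clique is a group of nodes that can be ordered so that an active connection runs from every earlier node to every later node, and the names `is_directed_clique`, `directed_cliques`, `maximal_only`, and `active_edges` are hypothetical.

```python
from itertools import combinations, permutations

def is_directed_clique(nodes, edges):
    """Assumed definition: some ordering of `nodes` has an active edge
    from every earlier node to every later node."""
    return any(
        all((order[i], order[j]) in edges
            for i in range(len(order)) for j in range(i + 1, len(order)))
        for order in permutations(nodes)
    )

def directed_cliques(nodes, edges, max_size=4):
    """Brute-force search for directed cliques of 2..max_size nodes."""
    found = []
    for size in range(2, max_size + 1):
        for group in combinations(sorted(nodes), size):
            if is_directed_clique(group, edges):
                found.append(frozenset(group))
    return found

def maximal_only(cliques):
    """Discard cliques that are strictly contained in a larger clique."""
    return [c for c in cliques if not any(c < other for other in cliques)]

# Hypothetical active connections (source, target) observed in one time window.
active_edges = {(0, 1), (0, 2), (1, 2), (2, 3)}
cliques = directed_cliques(range(4), active_edges)
maximal = maximal_only(cliques)
```

In this toy window, the three two-node cliques inside the clique on nodes 0, 1, and 2 are dropped, while the pair of nodes 2 and 3 is kept because no larger clique contains it. The exhaustive search is only practical for very small node groups.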

The method may include classifying the clique patterns into categories and characterizing the activity according to a number of occurrences of the clique patterns in respective ones of the categories. Classifying the clique patterns may include classifying the clique patterns according to a number of points within each clique pattern. The method may include outputting a binary sequence of 0s and 1s from the recurrent artificial neural network. Each digit in the sequence may represent whether a corresponding pattern of activity is present in the artificial neural network. The method may include structuring the artificial neural network by reading the digits output from the artificial neural network and evolving a structure of the artificial neural network. The structure of the artificial neural network may be evolved by iteratively changing the structure, characterizing a complexity of a pattern of activity in the changed structure, and using the characterization of the complexity of the pattern as an indication of whether the changed structure is desired.
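Classifying clique patterns by their number of points and counting occurrences per category could look like the following sketch. The input format (each pattern given as a set of node identifiers) and the names `count_by_size` and `patterns` are assumptions for illustration only.

```python
from collections import Counter

def count_by_size(clique_patterns):
    """Categorize each clique pattern by its number of points (nodes)
    and count how many patterns fall into each category."""
    return Counter(len(pattern) for pattern in clique_patterns)

# Hypothetical clique patterns identified in one time window.
patterns = [{0, 1, 2}, {2, 3}, {4, 5, 6}, {1, 3, 5, 7}]
counts = count_by_size(patterns)

# A fixed-length profile of the activity: occurrences of 2-, 3-, and 4-point patterns.
profile = [counts[size] for size in range(2, 5)]
```

The resulting profile is one possible way to characterize the activity in a window by the number of occurrences in each category.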

The artificial neural network may be a recurrent artificial neural network. The method may include identifying a decision moment in the recurrent artificial neural network based on a determination of a complexity of a pattern of activity in the recurrent artificial neural network. Identifying the decision moment may include determining a specific time of activity having a complexity that is distinguishable from other activity responsive to the input, and identifying the decision moment based on the specific time of the activity having the distinguishable complexity. The method may include inputting a data stream into the recurrent artificial neural network and identifying the clique pattern of activity during the input of the data stream. The method may include evaluating whether the activity is responsive to the input into the artificial neural network. The evaluating may include evaluating that a relatively simple pattern of activity relatively soon after an input event is responsive to the input but a relatively complex pattern of activity relatively soon after the input event is not responsive to the input, and evaluating that a relatively complex pattern of activity relatively later after the input event is responsive to the input but a relatively simple pattern of activity relatively later after the input event is not responsive to the input.

In another implementation, a system may include one or more computers operable to perform operations. The operations may include characterizing activity in an artificial neural network, and identifying a clique pattern of activity of the artificial neural network, wherein the clique pattern of activity encloses a cavity. The operations may include defining a plurality of time windows during which the activity of the artificial neural network is responsive to an input into the artificial neural network. The clique pattern of activity may be identified in each of the plurality of time windows. The operations may include identifying a first time window within the plurality of time windows based on a distinguishable likelihood of the clique pattern of activity occurring during the first time window. Identifying the clique pattern may include discarding or ignoring lower-dimensional directed cliques present in higher-dimensional directed cliques. The operations may include structuring the artificial neural network, including reading the digits output from the artificial neural network and evolving a structure of the artificial neural network. The structure of the artificial neural network may be evolved by iteratively changing the structure, characterizing a complexity of a pattern of activity in the changed structure, and using the characterization of the complexity of the pattern as an indication of whether the changed structure is desired. The artificial neural network may be a recurrent artificial neural network. The operations may include identifying a decision moment in the recurrent artificial neural network based on a determination of a complexity of a pattern of activity in the recurrent artificial neural network. Identifying the decision moment may include determining a specific time of activity having a complexity that is distinguishable from other activity responsive to the input, and identifying the decision moment based on the specific time of the activity having the distinguishable complexity. The operations may include inputting a data stream into the recurrent artificial neural network and identifying the clique pattern of activity during the input of the data stream. The operations may include evaluating whether the activity is responsive to the input into the artificial neural network. The evaluating may include evaluating that a relatively simple pattern of activity relatively soon after an input event is responsive to the input but a relatively complex pattern of activity relatively soon after the input event is not responsive to the input, and evaluating that a relatively complex pattern of activity relatively later after the input event is responsive to the input but a relatively simple pattern of activity relatively later after the input event is not responsive to the input.

As another example, a method for identifying decision moments in a neural network includes determining a complexity of a pattern of activities in a recurrent artificial neural network, wherein the activities are responsive to an input into the recurrent artificial neural network, determining a particular time of an activity having a distinguishable complexity from other activities responsive to the input, and identifying the decision moments based on the particular time of the activity having a distinguishable complexity.
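One way such a decision moment might be located is sketched below. The sketch assumes that a complexity value has already been computed for each time window and treats a window as "distinguishable" when its complexity lies at least `min_z` standard deviations above the mean; that criterion, the threshold value, and the name `decision_window` are assumptions rather than anything specified in this description.

```python
from statistics import mean, pstdev

def decision_window(complexity, min_z=2.0):
    """Return the index of the time window whose complexity stands out most
    from the other windows, or None if no window is distinguishable.
    The z-score criterion is an illustrative assumption."""
    mu = mean(complexity)
    sigma = pstdev(complexity)
    if sigma == 0:
        return None  # all windows equally complex: nothing stands out
    z_scores = [(c - mu) / sigma for c in complexity]
    best = max(range(len(z_scores)), key=lambda i: z_scores[i])
    return best if z_scores[best] >= min_z else None

# Hypothetical complexity of the activity in eight successive time windows.
complexity = [1.0, 1.2, 0.9, 1.1, 4.8, 1.0, 1.1, 0.9]
moment = decision_window(complexity)
```

Here the fifth window (index 4) is reported as the decision moment because its complexity is well above that of the other windows.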

As another example, a method for characterizing activity in a recurrent artificial neural network includes identifying a predefined clique pattern of activity of the recurrent artificial neural network. The method is performed by a data processing apparatus. As another example, a method may include outputting a binary sequence of 0s and 1s from a recurrent artificial neural network, wherein each digit in the sequence represents whether a particular group of nodes in the recurrent artificial neural network exhibits a corresponding pattern of activity.

As another example, a method of structuring a recurrent artificial neural network may include characterizing a complexity of a pattern of activity that may occur in the recurrent artificial neural network, the recurrent artificial neural network including a structured set of nodes and links between the nodes, and evolving a structure of the recurrent artificial neural network to increase the complexity of the pattern of activity. This method of structuring may also be used, for example, as part of a method of training the recurrent artificial neural network.
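The iterative evolution of a structure toward more complex activity might be sketched as a simple hill climb. Everything specific here is an assumption made for illustration: complexity is approximated by counting directed three-node cliques in the link structure itself, one link is toggled per iteration, and the names `count_directed_triangles` and `evolve` are hypothetical.

```python
import random
from itertools import permutations

def count_directed_triangles(edges, n):
    """Complexity proxy (an assumption): number of ordered triples (a, b, c)
    with links a->b, a->c, and b->c."""
    return sum(
        1 for a, b, c in permutations(range(n), 3)
        if (a, b) in edges and (a, c) in edges and (b, c) in edges
    )

def evolve(edges, n, steps=200, seed=0):
    """Iteratively toggle one link and keep the change only if the
    complexity proxy does not decrease."""
    rng = random.Random(seed)
    edges = set(edges)
    score = count_directed_triangles(edges, n)
    for _ in range(steps):
        a, b = rng.sample(range(n), 2)
        candidate = edges ^ {(a, b)}          # add the link if absent, remove it if present
        new_score = count_directed_triangles(candidate, n)
        if new_score >= score:                # changed structure is "desired"
            edges, score = candidate, new_score
    return edges, score

start_edges = {(0, 1), (1, 2)}
start_score = count_directed_triangles(start_edges, 5)
final_edges, final_score = evolve(start_edges, 5)
```

A practical version would measure complexity from the activity that actually arises in the changed structure, rather than from the structure alone as this sketch does.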

Other embodiments of these aspects include corresponding systems, apparatus, and computer programs, encoded on computer storage devices, configured to perform the actions of the methods.

Particular embodiments of the subject matter described in this specification can be implemented to realize one or more of the following advantages. For example, conventional data processing devices, such as digital computers, are programmed to follow a predefined logic sequence when processing information. The time at which the computer reaches a result is thus relatively easy to identify. That is, the completion of the logic sequence embodied in the programming indicates when the information processing is complete and the computer has "reached a decision." The result may be held at the output of the computer's data processor in a relatively long-lived form by, for example, a memory device or a set of buffers, and may be accessed for a variety of purposes.

In contrast, as described herein, decision moments in a recurrent artificial neural network may be identified based on characteristics of the dynamics of the neural network during information processing. Rather than waiting for the artificial neural network to reach a predefined end of a logic sequence, decision moments in the artificial neural network may be identified based on characteristics of the functional state of the artificial neural network during information processing.

Furthermore, features of the dynamics of recurrent artificial neural networks during information processing, including features such as activity consistent with clique patterns and directed clique patterns, can be used in a wide variety of signaling operations, including signal transmission, encoding, encryption, and storage. In particular, the characteristics of activity in the recurrent artificial neural network during information processing reflect the input and may be considered an encoded version of the input (i.e., the "output" of the recurrent artificial neural network in the encoding process). These characteristics may be transmitted, for example, to a remote receiver, which may decode the transmitted characteristics to reconstruct the input or a portion of the input.

Further, in some cases, activity in different groups of nodes of the recurrent artificial neural network (e.g., activity consistent with clique patterns and directed clique patterns) may be represented as a binary sequence of 0s and 1s, each digit indicating whether the activity is consistent with a pattern. Since in some scenarios the activity may be the output of a recurrent artificial neural network, the output of the recurrent artificial neural network may be represented as a vector of binary digits and be compatible with digital data processing.
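A binary representation of this kind might be produced as in the following sketch. It makes a simplifying assumption that a group of nodes "exhibits the pattern" whenever all of its nodes are active in the window; the names `encode_activity`, `node_groups`, and `active_nodes` are hypothetical.

```python
def encode_activity(active_nodes, node_groups):
    """Return one binary digit per node group: 1 if the group's activity is
    consistent with its pattern (here, simplified to: every node in the
    group is active), otherwise 0."""
    return [1 if group <= active_nodes else 0 for group in node_groups]

# Hypothetical node groups, each associated with one position in the output vector.
node_groups = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({4, 5})]
active_nodes = {0, 1, 2, 3}

bits = encode_activity(active_nodes, node_groups)
```

Because each position in the vector is tied to one node group, the vector can be passed directly to conventional digital processing.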

Furthermore, in some cases, such characterization of the dynamics of the recurrent artificial neural network may be used prior to and/or during training to increase the likelihood of complex patterns of activity occurring during information processing. For example, before or during training, links between nodes in a recurrent artificial neural network may be intentionally evolved to increase the complexity of the activity patterns. For example, links between nodes in a recurrent artificial neural network may be intentionally evolved to increase the likelihood that clique patterns and directed clique patterns of activity occur during information processing. This may reduce the time and effort required to train the recurrent artificial neural network.

As another example, such characterization of the dynamics of the recurrent artificial neural network may be used to determine the degree of completion of the training of the recurrent artificial neural network. For example, a recurrent artificial neural network that shows a particular type of ordering in its activity (e.g., clique patterns and directed clique patterns) may be considered more highly trained than a recurrent artificial neural network that does not show such ordering. Indeed, in some cases, the degree of training may be quantified by quantifying the degree of ordering of the activity in the recurrent artificial neural network.

As yet another example, in one implementation, an apparatus includes a neural network trained to generate, in response to a first input, an approximation of a first representation of a topology in a pattern of activity that occurs in a source neural network in response to the first input; to generate, in response to a second input, an approximation of a second representation of a topology in a pattern of activity that occurs in the source neural network in response to the second input; and to generate, in response to a third input, an approximation of a third representation of a topology in a pattern of activity that occurs in the source neural network in response to the third input.

This and other implementations can include one or more of the following features. The topologies may each include two or more nodes in the source neural network and one or more edges between the nodes. The topologies may include simplices. The topologies may enclose cavities. Each of the first, second, and third representations may represent topologies that appear in the source neural network only at times during which the pattern of activity has a complexity that is distinguishable from the complexity of other activity responsive to respective ones of the inputs. The device may also include a processor coupled to receive the approximations of the representations produced by the neural network and to process the received approximations. The processor may include a second neural network that has been trained to process representations produced by the neural network. Each of the first, second, and third representations may include multi-valued, non-binary digits. Each of the first, second, and third representations may represent the occurrence of the topologies without specifying where the pattern of activity occurs in the source neural network. The device may comprise a smartphone. The source neural network may be a recurrent neural network.

In another implementation, an apparatus includes a neural network coupled to input a representation of a topology in a pattern of activity occurring in a source neural network in response to a plurality of different inputs. The neural network is trained to process the representation and produce a response output.

This and other implementations can include one or more of the following features. The topologies may each include two or more nodes in the source neural network and one or more edges between the nodes. The topologies may include simplices. The representation of the topologies may represent topologies that appear in the source neural network only at times during which the pattern of activity has a complexity that is distinguishable from the complexity of other activity responsive to respective ones of the inputs. The device may include a neural network trained to generate, in response to a plurality of different inputs, respective approximations of representations of topologies in patterns of activity occurring in the source neural network in response to the different inputs. The representation of the topologies may include multi-valued, non-binary digits. The representation of the topologies may represent the occurrence of the topologies without specifying where the pattern of activity occurs in the source neural network. The source neural network may be a recurrent neural network.

In another implementation, a method is implemented by a neural network device and includes inputting a representation of a topology in a pattern of activities in a source neural network, wherein the activities are responsive to inputs into the source neural network, processing the representation, and outputting a result of the processing of the representation. The processing is consistent with training the neural network to handle different such representations of topology in the pattern of activity in the source neural network.

This and other implementations can include one or more of the following features. The topologies may each include two or more nodes in the source neural network and one or more edges between the nodes. The topologies may include simplices. The topologies may enclose cavities. The representation of the topologies may represent topologies that appear in the source neural network only at times during which the pattern of activity has a complexity that is distinguishable from the complexity of other activity responsive to respective ones of the inputs. The representation of the topologies may include multi-valued, non-binary digits. The representation of the topologies may represent the occurrence of the topologies without specifying where the pattern of activity occurs in the source neural network. The source neural network may be a recurrent neural network.

As yet another example, in one implementation, an apparatus includes a neural network coupled to input a representation of a topology in a pattern of activity occurring in a source neural network in response to a plurality of different inputs. The neural network is trained to process the representation and produce a response output.

This and other implementations can include one or more of the following features. The topologies each include two or more nodes in the source neural network and one or more edges between the nodes. The device may include an actuator coupled to receive the response output from the neural network and to act on a real or virtual environment, a sensor coupled to measure a characteristic of the environment, and a teacher module configured to interpret the measurements received from the sensor and provide rewards and/or regrets to the neural network. The topologies may include simplices. The topologies may enclose cavities. The representation of the topologies may represent topologies that appear in the source neural network only at times during which the pattern of activity has a complexity that is distinguishable from the complexity of other activity responsive to respective ones of the inputs. The device may include a second neural network trained to generate, in response to a plurality of different inputs, respective approximations of the representations of topologies in patterns of activity occurring in the source neural network in response to the different inputs. Such a device may also include an actuator coupled to receive the response output from the neural network and to act on a real or virtual environment, and a sensor coupled to measure a characteristic of the environment. The second neural network may be trained to produce the respective approximations at least partially in response to the measured characteristics of the environment. The device may also include a teacher module configured to interpret the measurements received from the sensor and provide rewards and/or regrets to the neural network. The representation of the topologies may include multi-valued, non-binary digits. The representation of the topologies may represent the occurrence of the topologies without specifying where the pattern of activity occurs in the source neural network. The device may be a smartphone. The source neural network may be a recurrent neural network.

In another implementation, a method implemented by one or more data processing devices may include receiving a training set that includes a plurality of representations of topologies in patterns of activity in a source neural network, and training a neural network using the representations either as inputs to the neural network or as target answer vectors. The activity is responsive to an input into the source neural network.

This and other implementations can include one or more of the following features. The topologies each include two or more nodes in the source neural network and one or more edges between the nodes. The training set may include a plurality of input vectors, each of the input vectors corresponding to a respective one of the representations. Training the neural network may include training the neural network using each of the plurality of representations as a target answer vector. Training the neural network may include training the neural network using each of the plurality of representations as an input. The training set may include a plurality of rewards or regrets. Training the neural network may include reinforcement learning. The topologies may include simplices. The representation of the topologies may represent topologies that appear in the source neural network only at times during which the pattern of activity has a complexity that is distinguishable from the complexity of other activity responsive to respective ones of the inputs. The representation of the topologies may include multi-valued, non-binary digits. The representation of the topologies may represent the occurrence of the topologies without specifying where the pattern of activity occurs in the source neural network. The source neural network may be a recurrent neural network.

The details of one or more implementations described in the specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

Drawings

Fig. 1 is a schematic illustration of the structure of a recurrent artificial neural network device.

Figs. 2 and 3 are schematic illustrations of the function of the recurrent artificial neural network device in different time windows.

Fig. 4 is a flow chart of a process for identifying decision moments in a recurrent artificial neural network based on characterization of activity in the network.

Fig. 5 is a schematic illustration of a pattern of activity that may be identified and used to identify decision moments in a recurrent artificial neural network.

Fig. 6 is a schematic illustration of a pattern of activity that may be identified and used to identify decision moments in a recurrent artificial neural network.

Fig. 7 is a schematic illustration of a pattern of activity that may be identified and used to identify decision moments in a recurrent artificial neural network.

Fig. 8 is a schematic illustration of a data table that may be used in the determination of the complexity or degree of ordering of activity patterns in a recurrent artificial neural network device.

Fig. 9 is a schematic illustration of the determination of a particular time of an activity pattern having a distinguishable complexity.

Fig. 10 is a flow chart of a process for encoding a signal using a recurrent artificial neural network based on characterization of activity in the network.

Fig. 11 is a flow chart of a process for decoding a signal using a recurrent artificial neural network based on characterization of activity in the network.

Figs. 12, 13, and 14 are schematic illustrations of binary forms or representations of topologies.

Figs. 15 and 16 schematically illustrate one example of how the presence or absence of features corresponding to different bits is not independent of one another.

Figs. 17, 18, 19, and 20 are schematic illustrations of the use of representations of the occurrence of topologies in the activity in a neural network in four different classification systems.

Figs. 21 and 22 are schematic illustrations of edge devices that include a local artificial neural network that can be trained using representations of occurrences of topologies corresponding to activity in a source neural network.

Fig. 23 is a schematic illustration of a system in which a local neural network may be trained using representations of occurrences of topologies corresponding to activity in the source neural network.

Figs. 24, 25, 26, and 27 are schematic illustrations of the use of representations of the occurrence of topologies in the activity in a neural network in four different systems.

Fig. 28 is a schematic illustration of a system that includes an artificial neural network that can be trained using representations of occurrences of topologies corresponding to activity in the source neural network.

Like reference symbols in the various drawings indicate like elements.

Detailed Description

Fig. 1 is a schematic illustration of the structure of a recurrent artificial neural network device 100. The recurrent artificial neural network device 100 is a device that simulates the information encoding and other processing capabilities of a biological neural network using a system of interconnected nodes. The recurrent artificial neural network device 100 may be implemented in hardware, software, or a combination thereof.

In the illustrated example, the recurrent artificial neural network device 100 includes a plurality of nodes 101, 102, ..., 107 interconnected by a plurality of structural links 110. Nodes 101, 102, ..., 107 are discrete information processing constructs analogous to neurons in a biological network. Nodes 101, 102, ..., 107 typically process one or more input signals received through one or more of links 110 to produce one or more output signals that are output through one or more of links 110. For example, in some implementations, nodes 101, 102, ..., 107 may be artificial neurons that weight and sum a plurality of input signals, pass the sum through one or more nonlinear activation functions, and output one or more output signals.

Nodes 101, 102, ..., 107 may operate as accumulators. For example, nodes 101, 102, ..., 107 may operate according to an integrate-and-fire model in which one or more signals are accumulated in a first node until a threshold is reached. After the threshold is reached, the first node fires by transmitting an output signal along one or more of the links 110 to a connected second node. In turn, the second node accumulates the signals it receives and, if a threshold is reached, transmits yet another output signal to a further connected node.
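The accumulate-then-fire behavior described above can be sketched as follows. This is a minimal illustration only: the class name `AccumulatorNode`, the reset-to-zero rule after firing, and the link strength of 0.6 are assumptions made for the example and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass
class AccumulatorNode:
    """Hypothetical integrate-and-fire node that accumulates input until a threshold."""
    threshold: float = 1.0
    potential: float = 0.0

    def receive(self, signal: float) -> bool:
        """Accumulate one input signal and return True if the node fires."""
        self.potential += signal
        if self.potential >= self.threshold:
            self.potential = 0.0  # assumed: the node resets to zero after firing
            return True
        return False


first = AccumulatorNode(threshold=1.0)
second = AccumulatorNode(threshold=1.0)

second_node_fired = []
for signal in [0.4, 0.4, 0.4, 0.7, 0.5]:
    if first.receive(signal):
        # The first node fired, so it transmits along a link to the second node.
        # The transmitted magnitude 0.6 is an assumed link strength.
        second_node_fired.append(second.receive(0.6))
```

In this sketch the second node only fires after the first node has fired twice, which mirrors the chained accumulation described in the text.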

The structural links 110 are connections capable of transmitting signals between nodes 101, 102, ..., 107. For convenience, all structural links 110 are treated herein as identical bidirectional links that transmit a signal from each first node of nodes 101, 102, ..., 107 to each second node of nodes 101, 102, ..., 107 in the same manner as a signal is transmitted from the second node to the first node. However, this is not necessarily the case. For example, some or all of the structural links 110 may be unidirectional links that transmit signals from a first one of nodes 101, 102, ..., 107 to a second one of nodes 101, 102, ..., 107 without transmitting signals from the second node to the first node.

As another example, in some implementations, the structural links 110 may have a variety of characteristics other than or in addition to directionality. For example, in some implementations, different structural links 110 may carry signals of different magnitudes, resulting in different interconnection strengths between respective ones of nodes 101, 102, ..., 107. As another example, different structural links 110 may carry different types of signals (e.g., inhibitory and/or excitatory signals). Indeed, in some implementations, the structural links 110 may model the links between somas in a biological system and reflect at least a portion of the vast morphological, chemical, and other diversity of such links.

In the illustrated implementation, the recurrent artificial neural network device 100 is a clique network (or subnetwork) because each node 101, 102, ..., 107 is connected to every other node 101, 102, ..., 107. This is not necessarily the case. Rather, in some implementations, each node 101, 102, ..., 107 may be connected to a proper subset of nodes 101, 102, ..., 107 (via identical links or diverse links, as the case may be).

For clarity of illustration, the recurrent artificial neural network device 100 is illustrated as having only seven nodes. Typically, a real world neural network device will include a significantly greater number of nodes. For example, in some implementations, a neural network device may include hundreds of thousands, millions, or even billions of nodes. Thus, the recurrent neural network device 100 may be a small portion (i.e., a sub-network) of a larger recurrent artificial neural network.

In a biological neural network device, the accumulation and signal transmission processes require the passage of time in the real world. For example, the soma of a neuron integrates inputs received over time, and signal transmission from neuron to neuron takes time that is determined by, for example, the signal transmission speed and the nature and length of the links between neurons. Thus, the state of the biological neural network device is dynamic and changes over time.

In an artificial recurrent neural network device, time is artificial and is represented using a mathematical construct. For example, rather than requiring the passage of real-world time for a signal to be transmitted from node to node, such signals may be represented in artificial units that are generally unrelated to the passage of real-world time, whether measured in computer clock cycles or otherwise. Nevertheless, the state of the artificial recurrent neural network device may be described as "dynamic" in that it changes with respect to these artificial units.

Note that for convenience, these artificial units are referred to herein as "time" units. However, it should be understood that these units are artificial and generally do not correspond to the real world time lapse.

Fig. 2 and 3 are schematic illustrations of the functionality of the recurrent artificial neural network device 100 in different time windows. Because the state of the device 100 is dynamic, the signaling activity that occurs within a window may be used to represent the functionality of the device 100. Such functional illustrations typically show activity in only a fraction of the links 110. In particular, because not every link 110 typically transmits a signal within a particular window, not every link 110 is illustrated as actively contributing to the functionality of the device 100 in these illustrations.

In the illustrations of fig. 2 and 3, active links 110 are illustrated as relatively thick solid lines connecting a pair of nodes 101, 102, ..., 107. In contrast, inactive links 110 are illustrated as dashed lines. This is for illustration only. In other words, the structural connections formed by the links 110 exist whether or not the links 110 are active. However, this formalism highlights the activity and functionality of the device 100.

In addition to schematically illustrating the existence of activity along a link, the direction of the activity is also schematically illustrated. In particular, the relatively thick solid lines that illustrate the active links 110 also include arrows representing the direction of signal transmission along the link during the relevant window. In general, the direction of signal transmission within a single window does not conclusively constrain the link to being a unidirectional link with the indicated directionality. Rather, in a first functional illustration for a first time window, a link may be active in a first direction. In a second functional illustration for a second window, the link may be active in the opposite direction. However, in some cases, such as in a recurrent artificial neural network device 100 that includes only unidirectional links, the directionality of the signal transmission will conclusively indicate the directionality of the link.

In a feed-forward neural network device, information moves in only a single direction (i.e., forward) to an output layer of nodes at the end of the network. In such a device, the propagation of a signal through the network to the output layer indicates that a "decision" has been made and that the information processing is complete.

In contrast, in a recurrent neural network, the connections between nodes form loops, and the activity of the network progresses dynamically without readily identifiable decisions. For example, even in a three-node recurrent neural network, a first node may transmit a signal to a second node, and in response, the second node may transmit a signal to a third node. In response, the third node may transmit a signal back to the first node. The signal received by the first node may be responsive, at least in part, to signals transmitted from that same node.

The schematic functional illustrations of fig. 2 and 3 illustrate this in a network that is only slightly larger than a three-node recurrent neural network. The functional illustration shown in fig. 2 may illustrate activity within a first window, and fig. 3 may illustrate activity within an immediately following second window. As shown, the collection of signaling activity appears to originate in node 104 and to progress in a generally clockwise direction through device 100 during the first window. Within the second window, at least some of the signaling activity generally appears to return to node 104. Even in such simplistic illustrations, the signaling does not proceed in a manner that produces a clearly identifiable output or end.

When considering a recurrent neural network of, for example, thousands of nodes or more, it is recognized that signal propagation can occur through a large number of paths and that these signals lack clearly identifiable "output" locations or times. Although by designing the network it is possible to return to a stationary state where only background or even no signaling activity occurs, the stationary state itself does not indicate the result of the information processing. Regardless of the input, the recurrent neural network always returns to a stationary state. Thus, the "output" or result of the information processing is encoded in the activity that occurs in the recurrent neural network in response to a particular input.

Fig. 4 is a flow chart of a process 400 for identifying decision moments in a recurrent artificial neural network based on characterization of activity in the network. A decision moment is a point in time at which the activity in the recurrent artificial neural network indicates the result of the information processing performed by the network in response to an input. Process 400 may be performed by a system of one or more data processing apparatus that perform operations according to the logic of one or more sets of machine-readable instructions. For example, process 400 may be performed by the same system of one or more computers that executes the software implementing the recurrent artificial neural network used in process 400.

At 405, the system performing process 400 receives a notification that a signal has been input into the recurrent artificial neural network. In some cases, the input of the signal is a discrete injection event, wherein, for example, information is injected into one or more nodes and/or one or more links of the neural network. In other cases, the input of the signal is an information stream injected into one or more nodes and/or links of the neural network over a period of time. The notification indicates that the artificial neural network is actively processing information and is not, for example, in a stationary state. In some cases, the notification is received from the neural network itself, such as, for example, when the neural network exits an identifiable stationary state.

At 410, the system executing process 400 divides the response activity in the network into a set of windows. In the case where the injection is a discrete event, the window may subdivide the time between injection and return to a stationary state into a plurality of periods during which the activity shows a variable complexity. In the case where the injection is a stream of information, the duration of the injection (and optionally the time to return to a stationary state after the injection is completed) may be subdivided into windows during which the activity shows a variable complexity. Various methods of determining the complexity of an activity are discussed further below.

In some implementations, the windows all have the same duration, but this is not necessarily the case. Conversely, in some implementations, the windows may have different durations. For example, in some implementations, the duration may increase as the time since the discrete injection event has occurred increases.

In some implementations, the window may be a continuous series of individual windows. In other implementations, the windows overlap in time such that one window begins before the end of the previous window. In some cases, the window may be a moving window that moves in time.
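The windowing options described above (back-to-back windows, overlapping windows, and a moving window) can be sketched with a single helper. The function name `make_windows` and its `duration` and `step` parameters are hypothetical; time is expressed in the artificial units discussed earlier.

```python
def make_windows(start, end, duration, step=None):
    """Return a list of (t0, t1) windows covering the interval [start, end).

    When step equals duration, the windows follow one another without overlap.
    When step is smaller than duration, each window begins before the previous
    one ends, which models overlapping or moving windows.
    """
    step = duration if step is None else step
    if duration <= 0 or step <= 0:
        raise ValueError("duration and step must be positive")
    windows = []
    t = start
    while t < end:
        windows.append((t, min(t + duration, end)))
        t += step
    return windows


def events_in_window(events, window):
    """Select (time, source, target) signaling events that fall inside a window."""
    t0, t1 = window
    return [event for event in events if t0 <= event[0] < t1]


separate = make_windows(0, 10, duration=4)
overlapping = make_windows(0, 10, duration=4, step=2)
```

Windows of increasing duration, as mentioned for discrete injection events, would require a different rule for choosing each window's length and are not covered by this sketch.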

In some implementations, windows of different durations are defined for different determinations of the complexity of the activity. For example, a window defined for an activity pattern that involves activity among a relatively large number of nodes may have a longer duration than a window defined for an activity pattern that involves activity among a relatively small number of nodes. For example, in the context of activity patterns 500 (FIG. 5), a window defined for identifying activity commensurate with pattern 530 may be longer than a window defined for identifying activity commensurate with pattern 505.

At 415, the system executing process 400 identifies patterns in the activity in the network within the different windows. As discussed further below, patterns in the activity may be identified by treating a functional graph as a topological space with nodes as points. In some implementations, the identified activity patterns are cliques, e.g., directed cliques, in a functional graph of the network.

At 420, the system executing process 400 determines the complexity of the activity patterns within the different windows. Complexity may be a measure of the likelihood that an ordered pattern of activity arises within a window. Thus, activity patterns that arise randomly will be relatively simple. On the other hand, activity patterns that show a non-random order are relatively complex. For example, in some implementations, the complexity of an activity pattern may be measured using, for example, the simplex counts or the Betti numbers of the activity pattern.

At 425, the system executing process 400 determines the timing of activity patterns that have a distinguishable complexity. A particular activity pattern may be distinguishable based on a complexity that deviates upward or downward, e.g., from a fixed or variable baseline. In other words, the timing of activity patterns that indicate a particularly high level or a particularly low level of non-random order in the activity may be determined.

For example, where the signal input is a discrete injection event, a deviation, e.g., a deviation from a stable baseline or a deviation from a curve that is characteristic of the neural network's average response to a wide variety of different discrete injection events, may be used to determine a particular time of the distinguishable complex activity pattern. As another example, where information is entered in a streaming form, large changes in complexity during streaming may be used to determine specific times of distinguishable complex activity patterns.
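One way to picture the deviation test described above is to compare a per-window complexity value with a per-window baseline. The sketch below is illustrative only: the function name, the `tolerance` threshold, and the sample values are assumptions, and a constant baseline list stands in for a fixed baseline while a varying list would stand in for an average response curve.

```python
def distinguishable_windows(complexity, baseline, tolerance):
    """Return the indices of windows whose complexity deviates from the baseline.

    complexity: one complexity value per window.
    baseline:   one expected value per window.
    tolerance:  assumed threshold; upward and downward deviations both count.
    """
    return [
        index
        for index, (value, expected) in enumerate(zip(complexity, baseline))
        if abs(value - expected) > tolerance
    ]


complexity = [2, 3, 9, 3, 2, 0, 2]      # made-up values for seven windows
baseline = [2] * len(complexity)        # fixed baseline
decision_windows = distinguishable_windows(complexity, baseline, tolerance=1)
```

Here the third window deviates upward and the sixth deviates downward, so both are flagged.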

At 430, the system executing process 400 schedules the reading of the output from the neural network based on the timing of the activity patterns that have distinguishable complexity. For example, in some implementations, the output of the neural network may be read at the same time that the distinguishably complex activity patterns occur. In implementations where the complexity deviation indicates a relatively high non-random order in the activity, the observed activity patterns themselves may also be regarded as the output of the recurrent artificial neural network.

Fig. 5 is an illustration of patterns 500 of activity that may be identified and used to identify decision moments in a recurrent artificial neural network. For example, patterns 500 may be identified at 415 in process 400 (fig. 4).

Patterns 500 are illustrations of activity in a recurrent artificial neural network. When patterns 500 are applied, the functional graph is treated as a topological space with nodes as points. Activity in nodes and links that is commensurate with patterns 500 may be identified as ordered, regardless of the identity of the particular nodes and/or links that participate in the activity. For example, the first pattern 505 may represent activity between nodes 101, 104, 105 in fig. 2, with point 0 in pattern 505 as node 104, point 1 as node 105, and point 2 as node 101. As another example, the first pattern 505 may also represent activity between nodes 104, 105, 106 in fig. 3, with point 0 in pattern 505 as node 106, point 1 as node 104, and point 2 as node 105. The order of the activity in a directed clique is also specified. For example, in pattern 505, the activity between point 1 and point 2 occurs after the activity between point 0 and point 1.

In the illustrated implementation, patterns 500 are all directed cliques or directed simplices. In such patterns, activity originates from a source node that transmits signals to every other node in the pattern. In patterns 500, such source nodes are designated as point 0, while the other nodes are designated as points 1, 2, .... Further, in a directed clique or simplex, one of the nodes acts as a sink and receives signals transmitted from every other node in the pattern. In patterns 500, such sink nodes are designated as the highest-numbered point in the pattern. For example, in pattern 505, the sink node is designated as point 2. In pattern 510, the sink node is designated as point 3. In pattern 515, the sink node is designated as point 4, and so on. Thus, the activity represented by patterns 500 is ordered in a distinguishable manner.
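The source-to-sink ordering of a directed clique can be checked by brute force on a small functional graph. The sketch below is an assumption-laden illustration: it represents active links as (source, target) pairs, ignores the temporal order of activity within the window, and uses made-up link activity loosely modeled on the fig. 2 example. The function name `directed_cliques` is hypothetical.

```python
from itertools import combinations, permutations


def directed_cliques(active_links, nodes, size):
    """Find orderings of `size` nodes that form a directed clique.

    Each result lists the nodes from source (point 0) to sink (highest point):
    every earlier node has an active link to every later node.
    Brute force, so only suitable for small illustrative graphs.
    """
    link_set = set(active_links)
    found = []
    for subset in combinations(sorted(nodes), size):
        for order in permutations(subset):
            # combinations(order, 2) yields each (earlier, later) pair once.
            if all((a, b) in link_set for a, b in combinations(order, 2)):
                found.append(order)
    return found


# Assumed active links within one window (illustrative values only).
active_links = {(104, 105), (105, 101), (104, 101), (101, 102)}
cliques = directed_cliques(active_links, {101, 102, 104, 105}, size=3)
```

The single match lists node 104 as point 0, node 105 as point 1, and node 101 as point 2, matching the assignment given in the text for pattern 505. If links are active in both directions, the same set of nodes can match under more than one ordering.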

Each of the patterns 500 has a different number of points and reflects ordered activity in a different number of nodes. For example, pattern 505 is a two-dimensional simplex and reflects activity in three nodes, pattern 510 is a three-dimensional simplex and reflects activity in four nodes, and so on. As the number of points in a pattern increases, so does the degree of ordering and complexity of the activity. For example, for a large set of nodes that have some degree of random activity within the window, some of the activity may be fortuitously compatible with pattern 505. However, random activity will progressively become increasingly less likely to be commensurate with the corresponding ones of the patterns 510, 515, 520. The presence of activity commensurate with pattern 530 indicates a relatively high degree of ordering and complexity in the activity as compared to the presence of activity commensurate with pattern 505.

As previously discussed, in some implementations, windows of different durations may be defined for different determinations of complexity of an activity. For example, when an activity commensurate with pattern 530 is to be identified, a longer duration window may be used than when an activity commensurate with pattern 505 is to be identified.

Fig. 6 is an illustration of patterns 600 of activity that may be identified and used to identify decision moments in a recurrent artificial neural network. For example, patterns 600 may be identified at 415 in process 400 (fig. 4).

Like patterns 500, patterns 600 are illustrations of activity in a recurrent artificial neural network. However, patterns 600 depart from the strict ordering of patterns 500 in that patterns 600 are not all directed cliques or directed simplices. In particular, patterns 605, 610 have a lower directionality than pattern 515. Indeed, pattern 605 lacks a sink node altogether. Nevertheless, patterns 605, 610 indicate a degree of ordered activity that exceeds what would be expected by random chance, and they may be used to determine the complexity of activity in the recurrent artificial neural network.

Fig. 7 is an illustration of patterns 700 of activity that may be identified and used to identify decision moments in a recurrent artificial neural network. For example, patterns 700 may be identified at 415 in process 400 (fig. 4).

Patterns 700 are groups of directed cliques or directed simplices of the same dimension (i.e., having the same number of points) that define patterns involving more points than a single clique or simplex and that enclose a cavity within the group of directed simplices.

By way of example, pattern 705 includes six different three-point, two-dimensional patterns 505 that together define a homology class of degree 2, while pattern 710 includes eight different three-point, two-dimensional patterns 505 that together define a second homology class of degree 2. Each of the three-point, two-dimensional patterns 505 in patterns 705, 710 may be thought of as enclosing a respective cavity. The nth Betti number associated with a directed graph provides a count of such homology classes within a topological representation.

The activity illustrated by a pattern such as pattern 700 illustrates a relatively high degree of ordering of activities in the network that are unlikely to occur by random chance. Pattern 700 may be used to characterize the complexity of the activity.

In some implementations, only some patterns of activity are identified and/or some portion of the identified patterns of activity is discarded or otherwise ignored during the identification of decision moments. For example, referring to FIG. 5, activity commensurate with the five-point, four-dimensional simplex pattern 515 inherently includes activity commensurate with the four-point, three-dimensional and three-point, two-dimensional simplex patterns 510, 505. For example, points 0, 2, 3, 4 and points 1, 2, 3, 4 in the four-dimensional simplex pattern 515 of FIG. 5 are both commensurate with the three-dimensional simplex pattern 510. In some implementations, patterns that contain fewer points, and thus have a lower dimension, may be discarded or otherwise ignored during the identification of decision moments.
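Discarding lower-dimensional patterns that are contained in a higher-dimensional pattern can be sketched as a subset filter. The function name `drop_contained` and the representation of each pattern occurrence as a tuple of participating points are assumptions made for this example.

```python
def drop_contained(patterns):
    """Keep only patterns whose point set is not a proper subset of another's."""
    point_sets = [frozenset(pattern) for pattern in patterns]
    return [
        pattern
        for pattern, points in zip(patterns, point_sets)
        if not any(points < other for other in point_sets)
    ]


occurrences = [
    (0, 1, 2, 3, 4),  # five-point pattern
    (0, 2, 3, 4),     # four-point pattern contained in the five-point pattern
    (1, 2, 3, 4),     # four-point pattern contained in the five-point pattern
    (5, 6, 7),        # three-point pattern not contained in any other
]
kept = drop_contained(occurrences)
```

The comparison is quadratic in the number of occurrences, which is acceptable for a sketch but would need a different approach for large counts.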

As another example, only some patterns of activity need be identified. For example, in some implementations, only patterns with an odd number of points (3, 5, 7, ...) or with even dimensions (2, 4, 6, ...) are used in the identification of decision moments.

The degree of complexity or ordering in the activity pattern in the recurrent artificial neural network device within different windows may be determined in a variety of different ways. Fig. 8 is a schematic illustration of a data table 800 that may be used in such a determination. The data table 800 may be used to determine the complexity of an activity pattern, alone or in combination with other activities. For example, data table 800 may be used at 420 in process 400 (fig. 4).

In more detail, table 800 includes a count of the number of patterns that occur during window "N", where the count of the number of activities that match patterns of different dimensions are presented in different rows. For example, in the illustrated example, row 805 includes a count of the number of occurrences of activity that matches one or more three-point, two-dimensional patterns (i.e., "2032"), while row 810 includes a count of the number of occurrences of activity that matches one or more four-point, three-dimensional patterns (i.e., "877"). Since the occurrence of patterns indicates that the activity has a non-random order, the count of numbers also provides a generalized characterization of the overall complexity of the activity pattern. A table similar to table 800 may be formed for each window defined, for example, at 410 in process 400 (fig. 4).
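A per-window count table of this kind can be sketched by grouping pattern occurrences by their number of points. The function name and the sample occurrences below are hypothetical, and the counts are not those shown in table 800.

```python
from collections import Counter


def count_by_points(occurrences):
    """Count pattern occurrences in one window, keyed by number of points."""
    counts = Counter(len(points) for points in occurrences)
    return dict(sorted(counts.items()))


window_occurrences = [
    (104, 105, 101),
    (106, 104, 105),
    (101, 102, 103, 104),
]
table = count_by_points(window_occurrences)
```

Each key corresponds to one row of the table: the key 3 holds the count for three-point, two-dimensional patterns, the key 4 the count for four-point, three-dimensional patterns, and so on.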

Although table 800 includes a separate row and a separate entry for each type of activity pattern, this is not necessarily the case. For example, one or more counts (e.g., counts of simpler patterns) may be omitted from the table 800 and from the determination of complexity. As another example, in some implementations, a single row or entry may include a count of occurrences of multiple activity patterns.

Although fig. 8 presents the number counts in a table 800, this is not necessarily the case. For example, the number counts may be presented as a vector (e.g., <2032, 877, 133, 66, 48, ...>). Regardless of how the counts are presented, in some implementations the counts may be expressed in binary and may be compatible with digital data processing infrastructure.

In some implementations, the number counts of occurrences of the patterns may be weighted or combined to determine the degree or complexity of the ordering, for example, at 420 in process 400 (fig. 4). For example, the Euler characteristic may provide an approximation of the complexity of the activity and is given by the following equation:

$$S_0 - S_1 + S_2 - S_3 + \cdots \qquad \text{(Equation 1)}$$

where S_n is the number of occurrences of a pattern of n points (i.e., a pattern of dimension n-1). The patterns may be, for example, the directed clique patterns 500 (fig. 5).
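The alternating sum in Equation 1 can be computed directly from a list of counts. In the sketch below, the function name is hypothetical and the list is simply read in the order in which the terms appear in Equation 1 (first term added, second subtracted, and so on). The sample counts are textbook simplicial complexes chosen because their Euler characteristics are well known, not counts from the description.

```python
def alternating_sum(counts):
    """Return counts[0] - counts[1] + counts[2] - ... as in Equation 1."""
    return sum((-1) ** index * count for index, count in enumerate(counts))


# A filled triangle: 3 vertices, 3 edges, 1 face.
filled_triangle = alternating_sum([3, 3, 1])

# A hollow tetrahedron: 4 vertices, 6 edges, 4 faces.
hollow_tetrahedron = alternating_sum([4, 6, 4])
```

The filled triangle yields 1 and the hollow tetrahedron, which encloses a cavity, yields 2.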

As another example of how the number of occurrences of patterns may be weighted to determine the degree or complexity of the ordering, in some implementations the pattern occurrences may be weighted based on the weights of the active links. In more detail, as previously discussed, the strength of the connections between nodes in the artificial neural network may vary, for example, as a result of how active the connections were during training. The occurrence of a pattern of activity along a set of relatively strong links may be weighted differently than the occurrence of the same pattern of activity along a set of relatively weak links. For example, in some implementations, the sum of the weights of the active links may be used to weight the occurrence.
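Weighting each occurrence by the sum of the weights of its active links can be sketched as follows. The function name, the node labels, and the link weights are all made up for illustration.

```python
def weighted_count(occurrences, link_weights):
    """Sum, over all occurrences, the weights of the links active in each one."""
    return sum(
        sum(link_weights[link] for link in links)
        for links in occurrences
    )


# Assumed link weights, e.g. as they might result from training.
link_weights = {
    ("a", "b"): 0.5, ("b", "c"): 0.25, ("a", "c"): 0.25,   # relatively weak links
    ("c", "d"): 2.0, ("d", "e"): 1.0, ("c", "e"): 1.0,     # relatively strong links
}

# Two occurrences of the same three-point pattern along different links.
occurrences = [
    [("a", "b"), ("b", "c"), ("a", "c")],
    [("c", "d"), ("d", "e"), ("c", "e")],
]
total = weighted_count(occurrences, link_weights)
```

An unweighted count would report 2 for these occurrences, whereas the weighted total gives the occurrence along the stronger links four times the contribution of the other.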

In some implementations, the Euler characteristic or another measure of complexity may be normalized by the total number of patterns that are matched within a particular window and/or by the total number of patterns that a given network may form in view of its structure. An example of normalization with respect to the total number of patterns that the network may form is given below in equations 2, 3.

In some implementations, occurrences of higher-dimensional patterns involving a greater number of nodes may be weighted more heavily than occurrences of lower-dimensional patterns involving a smaller number of nodes. For example, the probability of forming a directed clique decreases rapidly with increasing dimension. In particular, to form an n-clique from n+1 nodes, all (n+1)n/2 edges must be correctly oriented. This probability may be reflected in the weighting.
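How quickly this probability falls can be illustrated under a simple assumed model that is not stated in the description: every pair among the n+1 nodes is joined by exactly one link, and each link independently points in either direction with probability 1/2. Under that model, a consistent source-to-sink ordering exists for (n+1)! of the 2^((n+1)n/2) possible orientations.

```python
from math import factorial


def directed_clique_probability(n):
    """Probability that n+1 fully connected nodes form a directed n-clique.

    Assumes each of the (n+1)n/2 links is oriented independently and uniformly
    at random. Exactly (n+1)! orientations admit a source-to-sink ordering.
    """
    nodes = n + 1
    links = nodes * n // 2
    return factorial(nodes) / 2 ** links


probabilities = {n: directed_clique_probability(n) for n in range(2, 7)}
```

Under these assumptions the probability is 0.75 for three nodes and 0.375 for four nodes, and it keeps shrinking as nodes are added. One possible weighting, again an assumption, would scale each occurrence by the inverse of this probability.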

In some implementations, both the dimension and the directionality of the patterns may be used to weight the occurrences of the patterns and to determine the complexity of the activity. For example, referring to fig. 6, an occurrence of the five-point, four-dimensional pattern 515 may be weighted more heavily than an occurrence of the five-point, four-dimensional patterns 605, 610, based on the differences in the directionality of these patterns.

One example of using both directionality and dimensionality of a pattern to determine the degree of ordering or complexity of an activity may be given by the following equation:

where S_x^active indicates the number of active occurrences of a pattern of n points and ERN is the calculation for an equivalent random network (i.e., a network with the same number of nodes connected randomly). Further, SC is given by the following equation:

where S_x^silent indicates the number of occurrences of a pattern of n points when the recurrent artificial neural network is silent and may be considered to represent the total number of patterns that the network is able to form. In equations 2, 3, the patterns may be, for example, the directed clique patterns 500 (fig. 5).

Fig. 9 is a schematic illustration of the determination of a particular time of an activity pattern with distinguishable complexity. The determination illustrated in fig. 9 may be performed in isolation or in combination with other activities. For example, the determination may be performed at 425 in process 400 (fig. 4).

Fig. 9 includes a graph 905 and a graph 910. Graph 905 illustrates occurrences of patterns as a function of time along the x-axis. In particular, each occurrence is schematically illustrated as a vertical line 906, 907, 908, 909. Each row of occurrences may represent instances in which the activity matches a respective pattern or class of patterns. For example, the top row of occurrences may be instances in which the activity matches pattern 505 (FIG. 5), the second row of occurrences may be instances in which the activity matches pattern 510 (FIG. 5), the third row of occurrences may be instances in which the activity matches pattern 515 (FIG. 5), and so on.

The graph 905 also includes dashed rectangles 915, 920, 925 that schematically depict different time windows in which the activity patterns have distinguishable complexity. As shown, during the windows depicted by the dashed rectangles 915, 920, 925, activity in the recurrent artificial neural network is more likely to match patterns indicative of complexity than it is outside those windows.

Graph 910 illustrates the complexity associated with these occurrences as a function of time along the x-axis. The graph 910 includes a first peak 930 of complexity that coincides with the window depicted by the dashed rectangle 915 and a second peak 935 of complexity that coincides with the windows depicted by the dashed rectangles 920, 925. As shown, the complexity illustrated by peaks 930, 935 is distinguishable from what may be considered a baseline level 940 of complexity.

In some implementations, the time at which the output of the recurrent artificial neural network is to be read coincides with the occurrence of activity patterns that have distinguishable complexity. For example, in the illustrative scenario of fig. 9, the output of the recurrent artificial neural network may be read at peaks 930, 935, i.e., during the windows depicted by dashed rectangles 915, 920, 925.

The recognition of distinguishable levels of complexity in the recurrent artificial neural network is particularly beneficial when the input is a data stream. Examples of data streams include, for example, video or audio data. Although a data stream has a start, it is often desirable to process information in the data stream that has no predefined relationship with the start of the data stream. By way of example, the neural network may perform object recognition, such as, for example, recognizing a cyclist in the vicinity of an automobile. Such a neural network should be able to identify cyclists regardless of when those cyclists are present in the video stream, i.e., regardless of the time since the start of the video. Continuing with this example, when a data stream is input into the object recognition neural network, any pattern of activity in the neural network will typically exhibit a low or static level of complexity. These low or static levels of complexity are displayed regardless of the continuous (or near continuous) input of streaming data into the neural network device. However, when an object of interest appears in the video stream, the complexity of the activity will become distinguishable and indicate the time at which the object was identified in the video stream. Thus, a particular time of a distinguishable level of complexity of the activity may also serve as a yes/no output as to whether the data in the data stream meets certain criteria.

In some implementations, activity patterns of distinguishable complexity provide not only the timing of the output of the recurrent artificial neural network but also the content of that output. In particular, the identity and activity of the nodes that participate in activity commensurate with the activity patterns may be considered the output of the recurrent artificial neural network. Thus, the identified activity patterns may represent the result of the processing by the neural network, as well as the time at which this decision is to be read.

The content of the decision may be expressed in a variety of different forms. For example, in some implementations and as discussed in further detail below, the content of the decision may be expressed as a binary vector or matrix of ones and zeros. Each digit may indicate, for example, whether or not an activity pattern is present for a predefined group of nodes and/or a predefined duration. In such implementations, the content of the decision is expressed in binary and may be compatible with conventional digital data processing infrastructure.
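A binary vector of this kind can be sketched by testing each predefined group of nodes for a matching activity pattern. The function name, the choice of node groups, and the assumption that a group matches regardless of node order are all illustrative.

```python
def decision_bits(active_groups, predefined_groups):
    """Return one bit per predefined node group: 1 if a pattern was active there."""
    active = {frozenset(group) for group in active_groups}
    return [1 if frozenset(group) in active else 0 for group in predefined_groups]


# Assumed predefined node groups, one per bit position.
predefined_groups = [(101, 104, 105), (104, 105, 106), (102, 103, 107)]

# Assumed observation: a pattern was active among nodes 104, 105, 106.
bits = decision_bits([(105, 104, 106)], predefined_groups)
```

Stacking one such vector per window or per pattern type would give a matrix of ones and zeros.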

Fig. 10 is a flow chart of a process 1000 for encoding a signal using a recurrent artificial neural network based on characterization of activity in the network. The signal may be encoded in a variety of different scenarios, such as, for example, transmission, encryption, and data storage. Process 1000 may be performed by a system of one or more data processing apparatus performing operations according to logic of one or more sets of machine-readable instructions. For example, process 1000 may be performed by the same system of one or more computers that executes the software implementing the recurrent artificial neural network used in process 1000. In some examples, process 1000 may be performed by the same data processing apparatus that performs process 400. In some examples, process 1000 may be performed by an encoder, such as an encoder in a signal transmission system or an encoder of a data storage system.

At 1005, the system performing process 1000 inputs a signal into a recurrent artificial neural network. In some cases, the input of the signal is a discrete injection event. In other cases, the input signal is streamed into a recurrent artificial neural network.

At 1010, the system executing process 1000 identifies one or more decision moments in the recurrent artificial neural network. For example, the system may identify one or more decision moments by performing process 400 (fig. 4).

At 1015, the system executing process 1000 reads the output of the recurrent artificial neural network. As discussed above, in some implementations, the content of the output of the recurrent artificial neural network is the activity in the neural network that matches the patterns used to identify the decision moments.

In some implementations, a separate "reader node" may be added to the neural network to identify the occurrence of a particular pattern of activity at a particular set of nodes and thus to read the output of the recurrent artificial neural network at 1015. The reader node may fire if and only if the activity at a particular set of nodes meets a particular time (and possibly amplitude) criterion. For example, to read the occurrence of pattern 505 (FIG. 5) at nodes 104, 105, 106 (FIG. 2, FIG. 3), a reader node may be connected to nodes 104, 105, 106 (or link 110 therebetween). The reader node itself will become active only when a pattern of activity involving the nodes 104, 105, 106 (or their links) occurs.
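A minimal sketch of such a reader node is given below, assuming that the time criterion is simply that every watched node is active within a common window. The function name `reader_fires`, the spike-time data layout, and the sample values are hypothetical; an amplitude criterion is omitted.

```python
from typing import Dict, Iterable, List

def reader_fires(spike_times: Dict[int, List[float]],
                 watched: Iterable[int],
                 window: float) -> bool:
    """Return True only if every watched node is active within one common window."""
    watched = list(watched)
    # A node with no recorded activity can never satisfy the criterion.
    if any(not spike_times.get(node) for node in watched):
        return False
    # Any qualifying window can be slid so that it starts at some spike time.
    candidates = sorted(t for node in watched for t in spike_times[node])
    for start in candidates:
        if all(any(start <= t <= start + window for t in spike_times[node])
               for node in watched):
            return True
    return False

# Sample activity at nodes 104, 105, 106 (times in arbitrary units).
spikes = {104: [1.0, 9.0], 105: [2.5], 106: [3.0, 20.0]}

fires_wide = reader_fires(spikes, [104, 105, 106], window=2.0)
fires_narrow = reader_fires(spikes, [104, 105, 106], window=1.0)
print(fires_wide, fires_narrow)  # True False
```

Each reader node carries its own `window`, which is one way of giving different reader nodes different responses.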

The use of such a reader node would eliminate the need to define a time window for the recurrent artificial neural network as a whole. In particular, individual reader nodes may be connected to different nodes and/or multiple nodes (or links between them). Individual reader nodes may be configured with customized responses (e.g., different decay times in the integrated excitation pattern) to identify different activity patterns.

At 1020, the system executing process 1000 transmits or stores the output of the recurrent artificial neural network. The particular action performed at 1020 may reflect the scenario in which process 1000 is being used. For example, in a scenario where secure or compressed communication is desired, a system executing process 1000 may transmit the output of a recurrent neural network to a receiver that may access the same or a similar recurrent neural network. As another example, in a scenario where secure or compressed data storage is desired, a system executing process 1000 may record the output of the recurrent neural network in one or more machine-readable data storage devices for later access.

In some implementations, the complete output of the recurrent neural network is not transmitted or stored. For example, in implementations where the content of the output of the recurrent neural network is the activity in the neural network that matches patterns indicative of complexity in the activity, only activity that matches relatively more complex or higher-dimensional patterns may be transmitted or stored. By way of example, referring to patterns 500 (fig. 5), in some implementations, only activity matching patterns 515, 520, 525, and 530 is transmitted or stored, while activity matching patterns 505, 510 is ignored or discarded. In this way, lossy processing reduces the amount of data transmitted or stored at the expense of the completeness of the encoded information.
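The lossy step described above can be sketched as a simple filter on the dimension of each matched pattern. The `Occurrence` record, its fields, and the sample values are assumptions for illustration; the threshold of four reflects the statement elsewhere in this description that pattern 515 is four-dimensional while patterns 505 and 510 are two- and three-dimensional.

```python
from typing import List, NamedTuple, Tuple

class Occurrence(NamedTuple):
    """Hypothetical record of one matched pattern of activity."""
    dimension: int           # dimension of the matched simplex pattern
    nodes: Tuple[int, ...]   # nodes that took part in the activity
    time: float              # time at which the pattern was observed

def keep_high_dimensional(occurrences: List[Occurrence],
                          min_dimension: int) -> List[Occurrence]:
    """Discard occurrences whose dimension is below the threshold."""
    return [o for o in occurrences if o.dimension >= min_dimension]

full_output = [
    Occurrence(2, (104, 105, 106), 0.8),
    Occurrence(3, (101, 102, 104, 105), 1.1),
    Occurrence(4, (101, 102, 104, 105, 106), 1.3),
]

# Keep only four-dimensional and higher patterns before transmitting or storing.
reduced = keep_high_dimensional(full_output, min_dimension=4)
print(len(full_output), len(reduced))  # 3 1
```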

Fig. 11 is a flow chart of a process 1100 for decoding a signal using a recurrent artificial neural network based on characterization of activity in the network. The signal may be decoded in a variety of different scenarios, such as, for example, signal reception, decryption, and reading data from storage. Process 1100 may be performed by a system of one or more data processing apparatus performing operations according to logic of one or more sets of machine-readable instructions. For example, process 1100 may be performed by the same system of one or more computers that executes the software implementing the recurrent artificial neural network used in process 1100. In some examples, process 1100 may be performed by the same data processing apparatus that performs process 400 and/or process 1000. In some examples, process 1100 may be performed by a decoder, such as a decoder in a signal receiving system or a decoder of a data storage system.

At 1105, the system executing process 1100 receives at least a portion of an output of a recurrent artificial neural network. The particular action performed at 1105 may reflect the scenario in which process 1100 is being used. For example, the system executing process 1100 may receive a transmitted signal comprising the output of the recurrent artificial neural network or may read a machine-readable data storage device storing the output of the recurrent artificial neural network.

At 1110, the system performing process 1100 reconstructs an input of a recurrent artificial neural network from the received output. Reconstruction can be performed in a variety of different ways. For example, in some implementations, a second artificial neural network (recurrent or non-recurrent) may be trained to reconstruct an input into the recurrent neural network from the output received at 1105.

As another example, in some implementations, a decoder may be trained using machine learning (including but not limited to deep learning) to reconstruct the input into the recurrent neural network from the output received at 1105.

As yet another example, in some implementations, inputs into the same or a similar recurrent artificial neural network may be iteratively permuted until the output of that recurrent artificial neural network matches, to some extent, the output received at 1105.
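One possible reading of this iterative approach is a simple stochastic search, sketched below. The function `toy_network` is a deterministic stand-in invented for this sketch and is not a recurrent artificial neural network; `reconstruct_input`, the Hamming-distance criterion, and all parameters are likewise assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bits = List[int]

def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def reconstruct_input(network: Callable[[Bits], Bits], target_output: Bits,
                      n_inputs: int, tolerance: int = 0,
                      max_steps: int = 10_000, seed: int = 0) -> Tuple[Bits, int]:
    """Flip input bits at random, keeping changes that do not worsen the match."""
    rng = random.Random(seed)
    candidate = [rng.randint(0, 1) for _ in range(n_inputs)]
    best = hamming(network(candidate), target_output)
    for _ in range(max_steps):
        if best <= tolerance:
            break
        trial = candidate.copy()
        trial[rng.randrange(n_inputs)] ^= 1   # permute one input bit
        distance = hamming(network(trial), target_output)
        if distance <= best:                  # accept equal or better matches
            candidate, best = trial, distance
    return candidate, best

def toy_network(x: Bits) -> Bits:
    """Stand-in for the shared network: XOR of each bit with its neighbour."""
    n = len(x)
    return [x[i] ^ x[(i + 1) % n] for i in range(n)]

secret = [1, 0, 1, 1, 0, 0, 1, 0]
received = toy_network(secret)
guess, mismatch = reconstruct_input(toy_network, received, n_inputs=len(secret))
print(mismatch)  # remaining number of mismatched output bits
```

The `tolerance` parameter expresses the "to some extent" in the text: the search stops as soon as the candidate output is close enough to the received output. In this stand-in, an input and its bitwise complement produce the same output, so a perfect output match need not recover the original input exactly.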

In some implementations, process 1100 may include receiving a user input specifying the extent to which the input is to be reconstructed and, in response, adjusting the reconstruction at 1110 accordingly. For example, the user input may specify that a complete reconstruction is not required. In response, the system performing process 1100 adjusts the reconstruction. For example, in implementations where the content of the output of the recurrent neural network is the activity in the neural network that matches patterns indicative of complexity in the activity, only the output that characterizes activity matching relatively more complex or higher-dimensional patterns will be used to reconstruct the input. By way of example, referring to patterns 500 (fig. 5), in some implementations, only activity matching patterns 515, 520, 525, and 530 may be used to reconstruct the input, while activity matching patterns 505, 510 may be ignored or discarded. In this way, a lossy reconstruction can be performed under selected circumstances.

In some implementations, the processes 1000, 1100 may be used for peer-to-peer encrypted communications. In particular, both the transmitter (i.e., encoder) and the receiver (i.e., decoder) may be provided with the same recurrent artificial neural network. There are several ways in which the shared recurrent artificial neural network can be customized to ensure that a third party cannot reverse engineer it and decrypt the signal, including:

the structure of the recurrent artificial neural network,

the functional settings of the recurrent artificial neural network, including node states and edge weights,

the size (or dimension) of the patterns, and

the fraction of the patterns in each dimension.

These parameters may be considered as multiple layers that together ensure transmission security. Furthermore, in some implementations, the decision moments may be used as a key to decrypt the signal.

Although the processes 1000, 1100 are presented in terms of encoding and decoding using a single recurrent artificial neural network, the processes 1000, 1100 may also be applied in systems and processes that rely on multiple recurrent artificial neural networks. These recurrent artificial neural networks may be run in parallel or in series.

As one example of series operation, the output of a first recurrent artificial neural network may be used as an input to a second recurrent artificial neural network. The resulting output of the second recurrent artificial neural network is a twice-encoded (or twice-encrypted) version of the input into the first recurrent artificial neural network. Such a series arrangement of recurrent artificial neural networks may be useful where different parties have different levels of access to the information, for example, in a medical records system where patient identity information may be inaccessible to a party that will use and have access to the remainder of the medical record.

As one example of parallel operation, the same information may be input into a plurality of different recurrent artificial neural networks. The different outputs of these neural networks may be used, for example, to ensure that the input can be reconstructed with high fidelity.

While a number of implementations have been described, various modifications may be made. For example, while the application generally implies that activity in the recurrent artificial neural network should match a pattern that indicates ordering, this is not necessarily the case. Rather, in some implementations, activity in the recurrent artificial neural network may be commensurate with a pattern without necessarily displaying activity that matches the pattern. For example, an increased likelihood that a recurrent neural network will display activity matching a pattern may be treated as a non-random ordering of the activity.

As yet another example, in some implementations, different pattern groups may be customized for use in characterizing activity in different recurrent artificial neural networks. The patterns may be customized, for example, according to the effectiveness of the patterns in characterizing the activities of different recurrent artificial neural networks. Efficacy may be quantified, for example, based on the size of a table or vector representing the occurrence counts of the different patterns.

As yet another example, in some implementations, the patterns used to characterize activity in the recurrent artificial neural network may take into account the strength of the connections between nodes. In other words, the patterns previously described herein treat all signaling activity between two nodes in a binary manner (i.e., activity is either present or absent). This is not necessarily the case. Rather, in some implementations, being commensurate with a pattern may require activity on connections having a certain level or strength before the activity is considered to be indicative of ordered complexity in the activity of the recurrent artificial neural network.

As yet another example, the content of the output of the recurrent artificial neural network may include activity patterns that occur outside a time window during which activity in the neural network has a distinguishable level of complexity. For example, the output of the recurrent artificial neural network read at 1015 and transmitted or stored at 1020 (fig. 10) may include information encoding an activity pattern, for example, that occurs outside of the dashed rectangles 915, 920, 925 in the graph 905 (fig. 9). By way of example, the output of the recurrent artificial neural network can characterize only the highest dimensional patterns of activity, regardless of when those patterns of activity occur. As another example, the output of the recurrent artificial neural network can characterize only the patterns of activity surrounding the cavity, regardless of when those patterns of activity occur.

Fig. 12, 13 and 14 are schematic illustrations of a binary form or representation 1200 of topological structures such as, for example, patterns of activity in a neural network. The topological structures illustrated in fig. 12, 13 and 14 all include the same information, i.e., an indication of the presence or absence of features in a graph. The features may be, for example, activity in a neural network device. In some implementations, the activity is identified based on, or during, a period of time in which the activity in the neural network has a complexity that is distinguishable from that of other activity responsive to the input.

As shown, binary representation 1200 includes bits 1205, 1207, 1211, 1293, 1294, 1297 and an additional, arbitrary number of bits (represented by ellipses "..."). For didactic purposes, bits 1205, 1207, 1211, 1293, 1294, 1297 are illustrated as discrete rectangular shapes that are either filled or unfilled to indicate the binary value of each bit. In the schematic illustrations, representation 1200 appears to be either a one-dimensional vector of bits (fig. 12, 13) or a two-dimensional matrix of bits (fig. 14). However, representation 1200 differs from a vector, a matrix, or another ordered collection of bits in that the same information may be encoded regardless of the order of the bits, i.e., regardless of the location of individual bits within the collection.

For example, in some implementations, each individual bit 1205, 1207, 1211, 1293, 1294, 1297 may represent the presence or absence of a topological feature regardless of the location of that feature in the graph. By way of example, referring to fig. 2, a bit such as bit 1207 may indicate the presence of a topological feature commensurate with pattern 505 (fig. 5), regardless of whether the activity occurs between nodes 104, 105, 101 or between nodes 105, 101, 102. Thus, while each individual bit 1205, 1207, 1211, 1293, 1294, 1297 may represent the presence or absence of a particular feature, the position of that feature in the graph need not be encoded, for example, by the corresponding position of the bit in representation 1200. In other words, in some implementations, representation 1200 may provide only an isomorphic topological reconstruction of the graph.

Furthermore, in other implementations, the locations of the individual bits 1205, 1207, 1211, 1293, 1294, 1297 may in fact encode information such as the locations of features in the graph. In these implementations, the source graph can be reconstructed using representation 1200. However, such encoding need not be present.

Because bits can represent the presence or absence of a topological feature regardless of the position of that feature in the graph, the order of the bits within representation 1200 can change. In fig. 12, bit 1205 appears before bit 1207, and bit 1207 appears before bit 1211, at the beginning of representation 1200. In contrast, in fig. 13 and 14, the order of bits 1205, 1207, and 1211 in representation 1200, and the positions of bits 1205, 1207, and 1211 relative to other bits in representation 1200, have changed. Nevertheless, binary representation 1200 remains the same, as does the rule set or algorithm that defines the process used to encode the information in binary representation 1200. The positions of the bits in representation 1200 are irrelevant as long as the correspondence between the bits and the features is known.
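The irrelevance of bit order can be illustrated with a short sketch. The feature labels and the helper names `encode` and `decode` are hypothetical; the only requirement, as stated above, is that the correspondence between bits and features is known to whoever reads the representation.

```python
from typing import List, Sequence, Set

def encode(order: Sequence[str], present: Set[str]) -> List[int]:
    """Write one bit per feature, in the given order."""
    return [int(feature in present) for feature in order]

def decode(order: Sequence[str], bits: Sequence[int]) -> Set[str]:
    """Recover the set of present features from bits written in the given order."""
    return {feature for feature, bit in zip(order, bits) if bit}

present = {"2-simplex", "cavity"}

# Two different arrangements of the same bits.
order_a = ["node", "2-simplex", "3-simplex", "cavity"]
order_b = ["cavity", "node", "3-simplex", "2-simplex"]

bits_a = encode(order_a, present)
bits_b = encode(order_b, present)
print(bits_a, bits_b)  # [0, 1, 0, 1] [1, 0, 0, 1]

# The bit sequences differ, yet both decode to the same information.
same_information = decode(order_a, bits_a) == decode(order_b, bits_b) == present
print(same_information)  # True
```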

In more detail, each bit 1205, 1207, 1211, 1293, 1294, 1297 individually represents the presence or absence of a feature in a graph. A graph is a set of nodes and a set of edges between the nodes. The nodes may correspond to objects. Examples of objects may include, for example, artificial neurons in a neural network, individuals in a social network, and the like. Edges may correspond to some relationship between objects. Examples of relationships include, for example, structural connections or activity along the connections. In the context of neural networks, artificial neurons may be related by structural connections between neurons or by the transmission of information along structural connections. In the context of a social network, individuals may be related through "friend" or other relationship connections or through the transmission of information (e.g., posts) along such connections. Thus, an edge may characterize a relatively long-standing structural feature of a set of nodes or a relatively short-lived activity occurring within a defined time frame. Furthermore, edges may be directed or bidirectional. Directed edges indicate the directionality of the relationship between the objects. For example, the transmission of information from a first neuron to a second neuron may be represented by a directed edge that represents the direction of transmission. As another example, in a social network, a relationship connection may indicate that a second user will receive information from a first user, but not that the first user will receive information from the second user. In topological terms, a graph can be expressed as a set of unit intervals [0,1], where 0 and 1 are identified with the corresponding nodes connected by an edge.

The features whose presence or absence is indicated by bits 1205, 1207, 1211, 1293, 1294, 1297 may be, for example, a node, a set of nodes, a set of edges, and/or additional hierarchically more complex features (e.g., a set of sets of nodes). Bits 1205, 1207, 1211, 1293, 1294, 1297 generally represent the presence or absence of features at different hierarchical levels. For example, bit 1205 may represent the presence or absence of a node, while bit 1207 may represent the presence or absence of a set of nodes.

In some implementations, bits 1205, 1207, 1211, 1293, 1294, 1297 may represent features in the graph that have some characteristic at a threshold level. For example, bits 1205, 1207, 1211, 1293, 1294, 1297 may indicate not only that there is activity in a set of edges, but also that this activity is weighted above or below a threshold level. The weights may, for example, embody training of the neural network device for a particular purpose or may be inherent characteristics of the edges.

Fig. 5, 6, and 8 above illustrate features whose presence or absence may be represented by bits 1205, 1207, 1211, 1293, 1294, 1297.

The directed simplices in the collections 500, 600, 700 treat the functional or structural graph as a topological space with the nodes as points. Structure or activity that involves one or more nodes and links and is commensurate with a simplex in the collections 500, 600, 700 may be represented by a bit regardless of the identity of the particular nodes and/or links that participate in the activity.

In some implementations, only some patterns of structure or activity are identified, and/or some portion of the identified patterns of structure or activity is discarded or otherwise ignored. For example, referring to fig. 5, structure or activity commensurate with the five-point, four-dimensional simplex pattern 515 inherently includes structure or activity commensurate with the four-point, three-dimensional and three-point, two-dimensional simplex patterns 510, 505. For example, points 0, 2, 3, 4 and points 1, 2, 3, 4 in the four-dimensional simplex pattern 515 of fig. 5 are each commensurate with the three-dimensional simplex pattern 510. In some implementations, simplex patterns that contain fewer points, and thus have a lower dimension, may be discarded or otherwise ignored.
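The containment described above can be made concrete by enumerating the lower-dimensional simplices inside a higher-dimensional one. The sketch below assumes that a directed simplex is given as a tuple of points listed in a consistent order; the helper names `faces` and `drop_contained` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Simplex = Tuple[int, ...]

def faces(vertices: Sequence[int], dimension: int) -> List[Simplex]:
    """All sub-simplices of the given dimension (dimension + 1 points each)."""
    return list(combinations(vertices, dimension + 1))

def drop_contained(simplices: Sequence[Simplex]) -> List[Simplex]:
    """Discard every simplex whose points all lie inside a larger simplex."""
    point_sets = [frozenset(s) for s in simplices]
    return [s for s, points in zip(simplices, point_sets)
            if not any(points < other for other in point_sets)]

four_dim = (0, 1, 2, 3, 4)           # five points, four dimensions
three_dim = faces(four_dim, 3)       # four-point simplices inside it
two_dim = faces(four_dim, 2)         # three-point simplices inside it

print(len(three_dim), len(two_dim))  # 5 10
print((0, 2, 3, 4) in three_dim, (1, 2, 3, 4) in three_dim)  # True True

# Keeping only simplices that are not contained in a larger one.
kept = drop_contained([four_dim] + three_dim + two_dim)
print(kept)  # [(0, 1, 2, 3, 4)]
```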

As another example, only some patterns of structure or activity need to be identified. For example, in some implementations, only patterns with an odd number of points (3, 5, 7, ...) or an even number of dimensions (2, 4, 6, ...) are used.

Returning to fig. 12, 13, 14, the features whose presence or absence is indicated by bits 1205, 1207, 1211, 1293, 1294, 1297 may not be independent of one another. By way of explanation, if bits 1205, 1207, 1211, 1293, 1294, 1297 represent the presence or absence of zero-dimensional simplices, each reflecting the presence or activity of a single node, then bits 1205, 1207, 1211, 1293, 1294, 1297 are independent of each other. However, if bits 1205, 1207, 1211, 1293, 1294, 1297 represent the presence or absence of higher-dimensional simplices that each reflect the presence or activity of multiple nodes, the information encoded by the presence or absence of each individual feature may not be independent of the presence or absence of other features.

Fig. 15 schematically illustrates one example of how the presence or absence of features corresponding to different bits may not be independent of each other. In particular, a sub-graph 1500 is illustrated that includes four nodes 1505, 1510, 1515, 1520 and six directed edges 1525, 1530, 1535, 1540, 1545, 1550. In particular, edge 1525 points from node 1505 to node 1510, edge 1530 points from node 1515 to node 1505, edge 1535 points from node 1520 to node 1505, edge 1540 points from node 1520 to node 1510, edge 1545 points from node 1515 to node 1510, and edge 1550 points from node 1515 to node 1520.

A single bit in representation 1200 (e.g., filled bit 1207 in fig. 12, 13, 14) may indicate the presence of a directed three-dimensional simplex. For example, such a bit may indicate the presence of the three-dimensional simplex formed by nodes 1505, 1510, 1515, 1520 and edges 1525, 1530, 1535, 1540, 1545, 1550. A second bit in representation 1200 (e.g., filled bit 1293 in fig. 12, 13, 14) may indicate the presence of a directed two-dimensional simplex. For example, such a bit may indicate the presence of the two-dimensional simplex formed by nodes 1515, 1505, 1510 and edges 1525, 1530, 1545. In this simple example, the information encoded by bit 1293 is fully redundant with the information encoded by bit 1207.

Note that the information encoded by bit 1207 may also be redundant with the information encoded by still other bits. For example, the information encoded by bit 1207 will also be redundant with a third bit and a fourth bit that indicate the presence of additional directed two-dimensional simplices. Examples of these simplices are formed by nodes 1515, 1520, 1510 and edges 1540, 1545, 1550, and by nodes 1520, 1505, 1510 and edges 1525, 1535, 1540.
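The redundancy in sub-graph 1500 can be checked with a short sketch. It assumes that a directed simplex is a set of nodes that can be ordered so that every earlier node has an edge to every later node, and that edge 1525 runs from node 1505 to node 1510; the function name `is_directed_simplex` is hypothetical.

```python
from itertools import combinations, permutations
from typing import Sequence, Set, Tuple

Edge = Tuple[int, int]

def is_directed_simplex(nodes: Sequence[int], edges: Set[Edge]) -> bool:
    """True if some ordering has an edge from each earlier node to each later node."""
    return any(
        all((a, b) in edges for a, b in combinations(order, 2))
        for order in permutations(nodes)
    )

# Sub-graph 1500 as (source, target) pairs.
edges_1500 = {
    (1505, 1510),  # edge 1525
    (1515, 1505),  # edge 1530
    (1520, 1505),  # edge 1535
    (1520, 1510),  # edge 1540
    (1515, 1510),  # edge 1545
    (1515, 1520),  # edge 1550
}
nodes_1500 = (1505, 1510, 1515, 1520)

has_3_simplex = is_directed_simplex(nodes_1500, edges_1500)
triangles = [t for t in combinations(nodes_1500, 3)
             if is_directed_simplex(t, edges_1500)]

print(has_3_simplex)   # True
print(len(triangles))  # 4
```

Once the bit for the three-dimensional simplex is set, the bits for every two-dimensional simplex on the same nodes carry no additional information.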

Fig. 16 schematically illustrates another example of how the presence or absence of features corresponding to different bits may not be independent of each other. In particular, a sub-graph 1600 is illustrated that includes four nodes 1605, 1610, 1615, 1620 and five directed edges 1625, 1630, 1635, 1640, 1645. Nodes 1605, 1610, 1615, 1620 and edges 1625, 1630, 1635, 1640, 1645 generally correspond to nodes 1505, 1510, 1515, 1520 and edges 1525, 1530, 1535, 1540, 1545 in sub-graph 1500 (fig. 15). However, in contrast to sub-graph 1500, in which nodes 1515, 1520 are connected by edge 1550, nodes 1615, 1620 are not connected by an edge.

A single bit in representation 1200 (e.g., unfilled bit 1205 in fig. 12, 13, 14) may indicate the absence of a directed three-dimensional simplex (such as, for example, a directed three-dimensional simplex containing nodes 1605, 1610, 1615, 1620). A second bit in representation 1200 (e.g., filled bit 1293 in fig. 12, 13, 14) may indicate the presence of a directed two-dimensional simplex. An exemplary directed two-dimensional simplex is formed by nodes 1615, 1605, 1610 and edges 1625, 1630, 1645. This combination of filled bit 1293 and unfilled bit 1205 provides information about the presence or absence of other features (and about the state of other bits) that may or may not be included in representation 1200. In particular, the combination of the absence of the directed three-dimensional simplex and the presence of the directed two-dimensional simplex indicates that at least one edge is absent from either:

a) the possible directed two-dimensional simplex formed by nodes 1615, 1620, 1610, or

b) the possible directed two-dimensional simplex formed by nodes 1620, 1605, 1610.

Thus, the state of a bit representing the presence or absence of either of these possible simplices is not independent of the state of bits 1205, 1293.
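The same kind of check can be applied to sub-graph 1600. The sketch below assumes that the edges of sub-graph 1600 have the same directions as the corresponding edges of sub-graph 1500, with the edge from node 1615 to node 1620 absent; the helper names are hypothetical.

```python
from itertools import combinations, permutations
from typing import List, Sequence, Set, Tuple

Edge = Tuple[int, int]

def is_directed_simplex(nodes: Sequence[int], edges: Set[Edge]) -> bool:
    """True if some ordering has an edge from each earlier node to each later node."""
    return any(
        all((a, b) in edges for a, b in combinations(order, 2))
        for order in permutations(nodes)
    )

def unconnected_pairs(nodes: Sequence[int], edges: Set[Edge]) -> List[Edge]:
    """Pairs of nodes with no edge between them in either direction."""
    return [(a, b) for a, b in combinations(nodes, 2)
            if (a, b) not in edges and (b, a) not in edges]

# Sub-graph 1600: like sub-graph 1500, but nodes 1615 and 1620 are not connected.
edges_1600 = {
    (1605, 1610),  # edge 1625
    (1615, 1605),  # edge 1630
    (1620, 1605),  # edge 1635
    (1620, 1610),  # edge 1640
    (1615, 1610),  # edge 1645
}

bit_3d = int(is_directed_simplex((1605, 1610, 1615, 1620), edges_1600))
bit_2d = int(is_directed_simplex((1615, 1605, 1610), edges_1600))
gaps_a = unconnected_pairs((1615, 1620, 1610), edges_1600)  # candidate a)
gaps_b = unconnected_pairs((1620, 1605, 1610), edges_1600)  # candidate b)

print(bit_3d, bit_2d)  # 0 1
print(gaps_a, gaps_b)  # [(1615, 1620)] []
```

In this particular sub-graph the missing edge falls in candidate a), which is consistent with the inference that at least one of the two candidates lacks an edge.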

Although these examples have been discussed in terms of features having different numbers of nodes and a hierarchical relationship, this is not necessarily the case. For example, representation 1200 may include a set of bits that corresponds only to, e.g., the presence or absence of three-dimensional simplices.

The use of individual bits to indicate the presence or absence of features in a graph yields certain characteristics. For example, the encoding of the information is fault tolerant and provides "graceful degradation" of the encoded information. In particular, the loss of a particular bit (or group of bits) may increase the uncertainty as to the presence or absence of a feature. However, it will still be possible to evaluate the likelihood of the presence or absence of that feature from other bits that indicate the presence or absence of neighboring features.

Also, as the number of bits increases, the certainty as to the presence or absence of a feature increases.
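A simplified, deterministic special case of this fault tolerance is sketched below. It assumes only the containment relationship discussed with reference to fig. 15 (a present higher-dimensional feature implies its contained features are present, and an absent contained feature implies the higher-dimensional feature is absent). The feature labels and the function `infer_lost_bits` are hypothetical, and no likelihood estimate is attempted.

```python
from typing import Dict, Optional, Set

def infer_lost_bits(bits: Dict[str, Optional[int]],
                    contains: Dict[str, Set[str]]) -> Dict[str, Optional[int]]:
    """Fill in lost bits (None) that are forced by the surviving bits."""
    recovered = dict(bits)
    changed = True
    while changed:
        changed = False
        for parent, children in contains.items():
            # A present parent implies that every contained feature is present.
            if recovered.get(parent) == 1:
                for child in children:
                    if recovered.get(child) is None:
                        recovered[child] = 1
                        changed = True
            # An absent contained feature implies that the parent is absent.
            if recovered.get(parent) is None and any(
                    recovered.get(child) == 0 for child in children):
                recovered[parent] = 0
                changed = True
    return recovered

contains = {"3-simplex": {"triangle_a", "triangle_b"}}

lost_child = infer_lost_bits(
    {"3-simplex": 1, "triangle_a": None, "triangle_b": 1}, contains)
lost_parent = infer_lost_bits(
    {"3-simplex": None, "triangle_a": 0, "triangle_b": 1}, contains)
undetermined = infer_lost_bits(
    {"3-simplex": None, "triangle_a": 1, "triangle_b": 1}, contains)

print(lost_child["triangle_a"])   # 1
print(lost_parent["3-simplex"])   # 0
print(undetermined["3-simplex"])  # None
```

In the last case the surviving bits do not force a value, so the lost bit stays unknown; this is where a likelihood estimate from neighboring features, as described above, would be needed.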

As another example, as discussed above, the ordering or arrangement of the bits is irrelevant to the isomorphic reconstruction of the graph represented by the bits. All that is required is a known correspondence between the bits and particular nodes or structures in the graph.

In some implementations, patterns of activity in a neural network may be encoded in representation 1200 (fig. 12, 13, and 14). In general, the patterns of activity in a neural network are a result of many characteristics of the neural network, such as, for example, the structural connections between nodes of the neural network, the weights between nodes, and a number of possible other parameters. For example, in some implementations, the neural network may have been trained prior to the encoding of the patterns of activity in representation 1200.

However, regardless of whether the neural network is untrained or trained, for a given input, the responsive pattern of activity may be considered a "representation" or "abstraction" of that input within the neural network. Thus, although representation 1200 may appear to be a straightforward collection of (in some cases, binary) digits, each of the digits may encode a relationship or correspondence between a particular input and the relevant activity in the neural network.

Fig. 17, 18, 19, 20 are schematic illustrations of the use of representations of the occurrence of topological structures in the activity in a neural network in four different classification systems 1700, 1800, 1900, 2000. The classification systems 1700, 1800 each classify a representation of the patterns of activity in a neural network as part of the classification of an input. The classification systems 1900, 2000 each classify an approximation of a representation of the patterns of activity in a neural network as part of the classification of an input. In the classification systems 1700, 1800, the patterns of activity that are represented occur in, and are read from, a source neural network device 1705 that is part of the classification systems 1700, 1800. In contrast, in classification systems 1900, 2000, the patterns of activity that are approximately represented occur in a source neural network device that is not part of classification systems 1900, 2000. Instead, approximations of the representations of those patterns of activity are read from an approximator 1905 that is part of the classification systems 1900, 2000.

Turning in more detail to fig. 17, the classification system 1700 includes a source neural network 1705 and a linear classifier 1710. The source neural network 1705 is a neural network device configured to receive an input and present a representation of the occurrence of topological structures in the activity in the source neural network 1705. In the illustrated implementation, the source neural network 1705 includes an input layer 1715 that receives the input. However, this is not necessarily the case. For example, in some implementations, some or all of the input may be injected into different layers and/or edges or nodes throughout the source neural network 1705.

The source neural network 1705 may be any of a wide variety of different types of neural networks. Typically, the source neural network 1705 is a recurrent neural network, such as, for example, a recurrent neural network modeled on a biological system. In some cases, the source neural network 1705 may model, to some degree, the morphological, chemical, and other characteristics of a biological system. Typically, the source neural network 1705 is implemented on one or more computing devices (e.g., supercomputers) that have a relatively high level of computing performance. In such cases, the classification system 1700 will typically be a decentralized system in which the remote classifier 1710 communicates with the source neural network 1705, e.g., via a data communication network.

In some implementations, the source neural network 1705 may be untrained and the represented activity may be an intrinsic activity of the source neural network 1705. In other implementations, the source neural network 1705 may be trained and the represented activity may embody this training.

The representation read from the source neural network 1705 may be a representation such as representation 1200 (fig. 12, 13, 14). The representation may be read from the source neural network 1705 in a variety of ways. For example, in the illustrated example, the source neural network 1705 includes a "reader node" that reads a pattern of activity between other nodes in the source neural network 1705. In other implementations, the activity in the source neural network 1705 is read by a data processing component that is programmed to monitor a relatively highly ordered pattern of activity of the source neural network 1705. In other implementations, the source neural network 1705 may include an output layer from which the representation 1200 may be read, for example, when the source neural network 1705 is implemented as a feed-forward neural network.

The linear classifier 1710 is a device that classifies objects, i.e., representations of patterns of activity in the source neural network 1705, based on a linear combination of features of the objects. The linear classifier 1710 includes an input 1720 and an output 1725. Input 1720 is coupled to receive a representation of the pattern of activity in source neural network 1705. In other words, the representation of the pattern of activity in the source neural network 1705 is a feature vector representing features of the input into the source neural network 1705 that are used by the linear classifier 1710 to classify the input. The linear classifier 1710 can receive a representation of the patterns of activity in the source neural network 1705 in a variety of ways. For example, the representation of the pattern of activity may be received as discrete events or as a continuous stream over a real-time or non-real-time communication channel.
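A minimal sketch of a linear classifier that takes a binary representation as its feature vector follows. The class labels, weights, and biases are invented for illustration and are not the result of any training; the function name `linear_classify` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

def linear_classify(bits: Sequence[int],
                    weights: Dict[str, List[float]],
                    biases: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Score each class as a weighted sum of the bits plus a bias; pick the highest."""
    scores = {
        label: sum(w * b for w, b in zip(class_weights, bits)) + biases[label]
        for label, class_weights in weights.items()
    }
    return max(scores, key=scores.get), scores

# Representation read from the source network (sample values).
representation = [1, 0, 1, 1]

# Hypothetical parameters, one weight per bit for each class.
weights = {
    "cyclist": [2.0, -1.0, 0.5, 1.0],
    "no_cyclist": [-1.0, 1.5, 0.0, -0.5],
}
biases = {"cyclist": -1.0, "no_cyclist": 0.5}

label, scores = linear_classify(representation, weights, biases)
print(label, scores)  # cyclist {'cyclist': 2.5, 'no_cyclist': -1.0}
```

The arithmetic involved is a handful of multiplications and additions per class, which is why a classifier of this kind can run on modest hardware.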

The output 1725 is coupled to output the classification result from the linear classifier 1710. In the illustrated implementation, the output 1725 is schematically illustrated as a parallel port with multiple channels. This is not necessarily the case. For example, output 1725 may output the classification result via a serial port or a port with combined parallel and serial capabilities.

In some implementations, the linear classifier 1710 may be implemented on one or more computing devices with relatively limited computing capabilities. For example, the linear classifier 1710 may be implemented on a personal computer or mobile computing device such as a smart phone or tablet computer.

In fig. 18, the classification system 1800 includes a source neural network 1705 and a neural network classifier 1810. The neural network classifier 1810 is a neural network device that classifies objects, i.e., representations of patterns of activity in the source neural network 1705, based on a nonlinear combination of features of the objects. In the illustrated implementation, the neural network classifier 1810 is a feed-forward network that includes an input layer 1820 and an output layer 1825. As with the linear classifier 1710, the neural network classifier 1810 may receive a representation of the pattern of activity in the source neural network 1705 in a variety of ways. For example, the representation of the pattern of activity may be received as discrete events or as a continuous stream over a real-time or non-real-time communication channel.
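
As a non-limiting illustration of classification based on a nonlinear combination of features, the following minimal sketch passes a feature vector through one hidden layer with a rectifying activation before scoring the classes. The layer sizes, the parameter values `W1`, `B1`, `W2`, `B2`, and the function names are hypothetical; they are chosen so that the two-element input is classified by an exclusive-or rule, which no purely linear classifier can reproduce.

```python
from typing import List, Sequence


def dense(x: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """Fully-connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values: Sequence[float]) -> List[float]:
    """Nonlinear activation applied element-wise."""
    return [max(0.0, v) for v in values]


# Hypothetical parameters for a 2-input, 2-hidden-unit, 2-class network.
W1 = [[1.0, 1.0], [1.0, 1.0]]
B1 = [0.0, -1.0]
W2 = [[0.0, 0.0], [1.0, -2.0]]
B2 = [0.5, 0.0]


def classify(x: Sequence[float]) -> int:
    """Feed the input forward and return the highest-scoring class."""
    hidden = relu(dense(x, W1, B1))
    scores = dense(hidden, W2, B2)
    return max(range(len(scores)), key=scores.__getitem__)


labels = [classify(x) for x in ([0, 0], [1, 0], [0, 1], [1, 1])]
```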

In some implementations, the neural network classifier 1810 may perform the inference on one or more computing devices with relatively limited computing capabilities. For example, the neural network classifier 1810 may be implemented on a personal computer or mobile computing device such as a smart phone or tablet computer, for example, in a neural processing unit of such a device. Like classification system 1700, classification system 1800 will typically be a decentralized system in which remote neural network classifier 1810 communicates with source neural network 1705, such as via a data communication network.

In some implementations, the neural network classifier 1810 may be, for example, a deep neural network, such as a convolutional neural network that includes convolutional layers, pooling layers, and fully-connected layers. The convolutional layers may generate feature maps, for example, using linear convolution filters and/or nonlinear activation functions. The pooling layers reduce the number of parameters and control overfitting. The computations performed by the different layers in the neural network classifier 1810 may be defined in different ways in different implementations of the neural network classifier 1810.

In fig. 19, classification system 1900 includes a source approximator 1905 and a linear classifier 1710. As discussed further below, the source approximator 1905 is a relatively simple neural network trained to receive input (at the input layer 1915 or elsewhere) and output vectors that approximate a representation of the topology that appears in the pattern of activity in a relatively complex neural network. For example, the source approximator 1905 may be trained to approximate a recurrent source neural network, such as, for example, a recurrent neural network that mimics a biological system and includes a degree of the morphological, chemical, and other features of the biological system. In the illustrated implementation, the source approximator 1905 includes an input layer 1915 and an output layer 1920. The input layer 1915 may be coupled to receive input data. The output layer 1920 is coupled to output an approximation of the representation of activity within the neural network device for receipt by the input 1720 of the linear classifier 1710. For example, the output layer 1920 may output an approximation 1200' of the representation 1200 (fig. 12, 13, 14). The representation 1200 schematically illustrated in fig. 17 and 18 is drawn identically to the approximation 1200' of the representation 1200 schematically illustrated in fig. 19 and 20. This is for convenience only. Generally, the approximation 1200' will differ from the representation 1200 in at least some respects. Despite these differences, the linear classifier 1710 may still classify the approximation 1200'.

In general, the source approximator 1905 may perform inferences on one or more computing devices having relatively limited computing capabilities. For example, the source approximator 1905 may be implemented on a personal computer or mobile computing device such as a smart phone or tablet computer, for example, in a neural processing unit of such a device. In contrast to the classification systems 1700, 1800, the classification system 1900 will typically be housed within a single housing, for example, where the source approximator 1905 and the linear classifier 1710 are implemented on the same data processing device or on data processing devices coupled by a hardwired connection.

In fig. 20, classification system 2000 includes a source approximator 1905 and a neural network classifier 1810. The output layer 1920 of the source approximator 1905 is coupled to output an approximation 1200' of the representation of activity within the neural network device for receipt by the input layer 1820 of the neural network classifier 1810. Despite any differences between the approximation 1200' and the representation 1200, the neural network classifier 1810 may still classify the approximation 1200'. Like the classification system 1900, the classification system 2000 will typically be housed within a single housing, for example, where the source approximator 1905 and the neural network classifier 1810 are implemented on the same data processing device or on data processing devices coupled by a hardwired connection.

Fig. 21 is a schematic illustration of an edge device 2100 that includes a local artificial neural network that can be trained using representations of occurrences of topologies corresponding to activity in a source neural network. In this scenario, the local artificial neural network may be, for example, an artificial neural network that is executed entirely on one or more local processors that do not require a communication network to exchange data. Typically, the local processors will be connected by a hard-wired connection. In some examples, the local processor may be housed within a single housing, such as a single personal computer or a single hand-held, mobile device. In some instances, the local processor may be controlled and accessed by a single individual or a limited number of individuals. In fact, by training (e.g., using supervised learning or reinforcement learning techniques) a simpler and/or less highly trained but more unique second neural network using representations of the occurrence of topologies in a more complex source neural network, even individuals with limited computational resources and a limited number of training samples can train the neural network as desired. Memory requirements and computational complexity during training are reduced and resources like battery life are saved.

In the illustrated implementation, the edge device 2100 is schematically illustrated as a security camera device that includes an optical imaging system 2110, image processing electronics 2115, a source approximator 2120, a representation classifier 2125, and a communication controller and interface 2130.

The optical imaging system 2110 may comprise, for example, one or more lenses (or even pinholes) and a CCD device. The image processing electronics 2115 can read the output of the optical imaging system 2110 and can generally perform basic image processing functions. The communication controller and interface 2130 is a device configured to control the flow of information to the device 2100 and from the device 2100. Among the operations that the communication controller and interface 2130 may perform are to transmit images of interest to other devices and to receive training information from other devices, as discussed further below. Thus, the communication controller and interface 2130 may include both a data transmitter and receiver that may communicate through, for example, data port 2135. The data port 2135 may be a wired port, a wireless port, an optical port, etc.

The source approximator 2120 is a relatively simple neural network trained to output a vector that approximates a representation of the topology that appears in the pattern of activity in a relatively complex neural network. For example, the source approximator 2120 may be trained to approximate a recurrent source neural network, such as, for example, a recurrent neural network that mimics a biological system and includes a degree of the morphological, chemical, and other features of the biological system.

The representation classifier 2125 is a linear classifier or neural network classifier that is coupled to receive an approximation of a representation of the pattern of activity in the source neural network from the source approximator 2120 and to output classification results. The representation classifier 2125 may be, for example, a deep neural network, such as a convolutional neural network that includes convolutional layers, pooling layers, and fully-connected layers. The convolutional layers may generate feature maps, for example, using linear convolution filters and/or nonlinear activation functions. The pooling layers reduce the number of parameters and control overfitting. The computations performed by the different layers in the representation classifier 2125 may be defined in different ways in different implementations of the representation classifier 2125.

In some implementations, in operation, the optical imaging system 2110 may generate a raw digital image. The image processing electronics 2115 can read the original image and will typically perform at least some basic image processing functions. The source approximator 2120 may receive the image from the image processing electronics 2115 and perform an inference operation to output a vector that approximates a representation of the topology that appears in the pattern of activity in the relatively complex neural network. This approximation vector is input into a representation classifier 2125, which representation classifier 2125 determines whether the approximation vector satisfies one or more sets of classification criteria. Examples include facial recognition and other machine vision operations. In the event that the representation classifier 2125 determines that the approximation vector meets a set of classification criteria, the representation classifier 2125 may instruct the communication controller and interface 2130 to transmit information about the image. For example, the communication controller and interface 2130 may transmit the images themselves, classifications, and/or other information about the images.
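
As a non-limiting illustration, the flow of operations described above may be sketched as a single function that takes the approximation step, the classification criteria, and the transmission step as arguments. The names `process_frame`, `toy_approximator`, `toy_criteria`, and `toy_transmit`, and the toy logic inside them, are hypothetical stand-ins for the source approximator 2120, the representation classifier 2125, and the communication controller and interface 2130; they are not taken from the description.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[int]]
Vector = List[int]


def process_frame(image: Image,
                  approximate: Callable[[Image], Vector],
                  meets_criteria: Callable[[Vector], bool],
                  transmit: Callable[[Image, Vector], None]) -> bool:
    """Approximate, classify, and transmit only when the criteria are met."""
    approximation = approximate(image)
    if meets_criteria(approximation):
        transmit(image, approximation)
        return True
    return False


# Hypothetical stand-ins used only to exercise the flow.
sent: List[Vector] = []


def toy_approximator(image: Image) -> Vector:
    # One output element per image row: 1 if the row is mostly "on".
    return [1 if sum(row) > 1 else 0 for row in image]


def toy_criteria(vector: Vector) -> bool:
    # Criteria are met when at least two elements are set.
    return sum(vector) >= 2


def toy_transmit(image: Image, vector: Vector) -> None:
    sent.append(vector)


hit = process_frame([[1, 1], [1, 1], [0, 0]],
                    toy_approximator, toy_criteria, toy_transmit)
miss = process_frame([[0, 1], [0, 0], [0, 0]],
                     toy_approximator, toy_criteria, toy_transmit)
```

Keeping the three steps as interchangeable arguments mirrors the point made below that either the approximator or the classifier may be retrained without changing the rest of the device.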

Sometimes, it may be desirable to change the classification process. In these cases, the communication controller and interface 2130 may receive a training set. In some implementations, the training set may include raw or processed image data and representations of topologies that occur in patterns of activity in relatively complex neural networks. Such training sets may be used to retrain the source approximator 2120, for example, using supervised learning or reinforcement learning techniques. In particular, the representation is used as a target answer vector and represents the desired result of the source approximator 2120 processing the raw or processed image data.
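
As a non-limiting illustration of supervised retraining in which the representation serves as the target answer vector, the following minimal sketch fits a single-layer approximator with sigmoid outputs by full-batch gradient descent on a cross-entropy loss. The single-layer architecture, the loss, the learning rate, the epoch count, and the four toy training samples are all assumptions made for illustration; the description does not specify the architecture or the training procedure of the source approximator 2120.

```python
import math
from typing import List, Sequence, Tuple

# Each sample pairs input data with the target representation.
Sample = Tuple[Sequence[float], Sequence[int]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, biases, x: Sequence[float]) -> List[float]:
    """Approximated representation: one value in (0, 1) per element."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def cross_entropy(weights, biases, samples: Sequence[Sample]) -> float:
    total = 0.0
    for x, target in samples:
        for p, t in zip(predict(weights, biases, x), target):
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(samples)


def retrain(samples: Sequence[Sample], n_in: int, n_out: int,
            epochs: int = 300, lr: float = 1.0):
    """Full-batch gradient descent starting from all-zero parameters."""
    weights = [[0.0] * n_in for _ in range(n_out)]
    biases = [0.0] * n_out
    for _ in range(epochs):
        grad_w = [[0.0] * n_in for _ in range(n_out)]
        grad_b = [0.0] * n_out
        for x, target in samples:
            outputs = predict(weights, biases, x)
            for j, (p, t) in enumerate(zip(outputs, target)):
                err = p - t  # gradient of the loss w.r.t. the pre-activation
                grad_b[j] += err
                for i, xi in enumerate(x):
                    grad_w[j][i] += err * xi
        for j in range(n_out):
            biases[j] -= lr * grad_b[j] / len(samples)
            for i in range(n_in):
                weights[j][i] -= lr * grad_w[j][i] / len(samples)
    return weights, biases


# Hypothetical training set: (input data, target representation).
samples: List[Sample] = [([0, 0], [0, 1]), ([1, 0], [1, 1]),
                         ([0, 1], [0, 0]), ([1, 1], [1, 0])]
initial_loss = cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], samples)
weights, biases = retrain(samples, n_in=2, n_out=2)
final_loss = cross_entropy(weights, biases, samples)
```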

In other implementations, the training set may include representations of topologies that occur in patterns of activity in relatively complex neural networks, as well as desired classifications of those representations. Such training sets may be used to retrain the neural network representation classifier 2125, for example, using supervised learning or reinforcement learning techniques. In particular, the desired classification is used as the target answer vector and represents the desired result of the representation classifier 2125 processing the representation of the topology.

Regardless of whether the source approximator 2120 or the representation classifier 2125 is retrained, the inference operations at the device 2100 can be readily adapted to changing situations and goals without requiring a large training data set and time-intensive and computationally-intensive iterative training.

Fig. 22 is a schematic illustration of a second edge device 2200 that includes a local artificial neural network that can be trained using representations of occurrences of topologies corresponding to activity in the source neural network. In the illustrated implementation, the second edge device 2200 is schematically illustrated as a mobile computing device such as a smart phone or tablet computer. Device 2200 includes an optical imaging system (e.g., on the back side of device 2200, not shown), image processing electronics 2215, representation classifier 2225, communication controller and interface 2230, and data port 2235. These components may have features and perform actions corresponding to those of the optical imaging system 2110, the image processing electronics 2115, the representation classifier 2125, the communication controller and interface 2130, and the data port 2135 in the device 2100 (fig. 21).

The illustrated implementation of the device 2200 additionally includes one or more additional sensors 2240 and a multiple-input source approximator 2245. The sensor 2240 may sense one or more of a number of characteristics of the environment surrounding the device 2200 or of the device 2200 itself. For example, in some implementations, the sensor 2240 may be an accelerometer that senses the acceleration experienced by the device 2200. As another example, in some implementations, the sensor 2240 may be an acoustic sensor, such as a microphone, that senses noise in the environment of the device 2200. Other examples of a sensor 2240 include a chemical sensor (e.g., an "artificial nose" or the like), a humidity sensor, a radiation sensor, and the like. In some cases, the sensor 2240 is coupled to processing electronics that can read the output of the sensor 2240 (or other information, such as, for example, a contact list or map) and perform basic processing functions. Thus, different implementations of sensor 2240 may have different modalities, as the physical parameter that is sensed varies from sensor to sensor.

The multiple-input source approximator 2245 is a relatively simple neural network trained to output vectors that approximate a representation of the topology that appears in the pattern of activity in a relatively complex neural network. For example, the multiple-input source approximator 2245 may be trained to approximate a recurrent source neural network, such as, for example, a recurrent neural network that mimics a biological system and includes a degree of the morphological, chemical, and other features of the biological system.

Unlike the source approximator 2120, the multiple-input source approximator 2245 is coupled to receive raw or processed sensor data from a plurality of sensors and to return, based on that data, an approximation of a representation of the topology that appears in the pattern of activity in the relatively complex neural network. For example, the multiple-input source approximator 2245 may receive processed image data from the image processing electronics 2215 as well as acoustic, acceleration, chemical, or other data, for example, from one or more sensors 2240. The multiple-input source approximator 2245 may be, for example, a deep neural network, such as a convolutional neural network that includes convolutional layers, pooling layers, and fully-connected layers. The computations performed by the different layers in the multiple-input source approximator 2245 may be specific to a single type of sensor data or may apply to multiple modalities of sensor data.

Regardless of the particular organization of the multiple-input source approximator 2245, the multiple-input source approximator 2245 will return approximations based on raw or processed sensor data from the plurality of sensors. For example, the processed image data from the image processing electronics 2215 and the acoustic data from the microphone sensor 2240 may be used by the multiple-input source approximator 2245 to approximate a representation of the topology that would appear in the pattern of activity in a relatively complex neural network that receives the same data.
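
As a non-limiting illustration of an approximator that draws on several sensor modalities, the following minimal sketch concatenates per-modality feature lists into one input and thresholds weighted sums of that input to produce a binary approximation. Concatenation as the fusion step, the function names `fuse_inputs` and `approximate`, and all numeric values are assumptions made for illustration; the description leaves the organization of the multiple-input source approximator 2245 open.

```python
from typing import List, Sequence


def fuse_inputs(*modalities: Sequence[float]) -> List[float]:
    """Concatenate the features of every modality into one input vector."""
    fused: List[float] = []
    for features in modalities:
        fused.extend(features)
    return fused


def approximate(fused: Sequence[float],
                weights: Sequence[Sequence[float]],
                thresholds: Sequence[float]) -> List[int]:
    """Set an output element to 1 when its weighted sum exceeds its threshold."""
    return [
        1 if sum(w * x for w, x in zip(row, fused)) > threshold else 0
        for row, threshold in zip(weights, thresholds)
    ]


# Hypothetical processed image features and acoustic features.
image_features = [0.2, 0.8]
acoustic_features = [0.5]

# Hypothetical parameters: rows respond to image only, sound only, or both.
weights = [[1.0, 1.0, 0.0],
           [0.0, 0.0, 1.0],
           [1.0, 0.0, 1.0]]
thresholds = [0.9, 0.6, 0.6]

fused = fuse_inputs(image_features, acoustic_features)
approximation = approximate(fused, weights, thresholds)
```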

Sometimes, it may be desirable to change the classification process at the device 2200. In these cases, communication controller and interface 2230 may receive a training set. In some implementations, the training set may include raw or processed image, sound, chemical, or other data, as well as representations of topologies that occur in patterns of activity in relatively complex neural networks. Such training sets may be used to retrain the multiple-input source approximator 2245, for example, using supervised learning or reinforcement learning techniques. In particular, the representation is used as a target answer vector and represents the desired result of the multiple-input source approximator 2245 processing the raw or processed image or sensor data.

In other implementations, the training set may include representations of topologies that occur in patterns of activity in relatively complex neural networks, as well as desired classifications of those representations. Such training sets may be used to retrain the neural network representation classifier 2225, for example, using supervised learning or reinforcement learning techniques. In particular, the desired classification is used as the target answer vector and represents the desired result of the representation classifier 2225 processing the representation of the topology.

Regardless of whether the multiple-input source approximator 2245 or the representation classifier 2225 is retrained, the inference operations at the device 2200 can be easily adapted to changing situations and goals without the need for large training data sets and time-intensive and computationally-intensive iterative training.

Fig. 23 is a schematic illustration of a system 2300 in which a representation of the occurrence of a topology corresponding to activity in a source neural network may be used to train a local neural network. The target neural network is implemented on a relatively simple, less expensive data processing system, while the source neural network may be implemented on a relatively complex, more expensive data processing system.

The system 2300 includes a variety of devices 2305 having local neural networks, a telephone base station 2310, a wireless access point 2315, a server system 2320, and one or more data communication networks 2325.

The local neural network device 2305 is a device configured to process data using a less computationally intensive target neural network. As illustrated, the local neural network device 2305 may be implemented as any of a mobile computing device, a camera, an automobile, or a host of other appliances, fixtures, and mobile components, as well as different brands and models of devices within each category. Different local neural network devices 2305 may belong to different owners. In some implementations, access to the data processing functions of the local neural network device 2305 will typically be limited to these owners and/or their designees.

The local neural network devices 2305 may each include one or more source approximators trained to output vectors that approximate representations of topologies that occur in patterns of activity in relatively complex neural networks. For example, the relatively complex neural network may be a recurrent source neural network, such as, for example, a recurrent neural network that mimics a biological system and includes a degree of the morphological, chemical, and other features of the biological system.

In some implementations, in addition to processing data using a source approximator, the local neural network device 2305 may be programmed to retrain the source approximator using a representation of the topology that appears in the pattern of activity in the relatively complex neural network as a target answer vector. For example, the local neural network device 2305 may be programmed to perform one or more iterative training techniques (e.g., gradient descent or stochastic gradient descent). In other implementations, the source approximator in the local neural network device 2305 is trainable by, for example, a dedicated training system or a training system installed on a personal computer that may interact with the local neural network device 2305 to train the source approximator.

Each local neural network device 2305 includes one or more wireless or wired data communication components. In the illustrated implementation, each local neural network device 2305 includes at least one wireless data communication component, such as a mobile telephone transceiver, a wireless transceiver, or both. The mobile phone transceiver can exchange data with a phone base station 2310. The wireless transceiver can exchange data with a wireless access point 2315. Each local neural network device 2305 may also be capable of exchanging data with a peer mobile computing device.

The telephone base station 2310 and the wireless access point 2315 are connected for data communication with one or more data communication networks 2325 and can exchange information with the server system 2320 via the networks. Thus, the local neural network device 2305 is also typically in data communication with the server system 2320. However, this is not necessarily the case. For example, in implementations where the local neural network device 2305 is trained by other data processing devices, the local neural network device 2305 need only be in data communication with these other data processing devices at least once.

Server system 2320 is a system of one or more data processing devices that are programmed to perform data processing activities in accordance with one or more sets of machine-readable instructions. The activities may include providing a training set to a training system for the local neural network device 2305. As discussed above, the training system may be internal to the local neural network device 2305 itself or on one or more other data processing devices. The training set may include representations of occurrences of topologies corresponding to activity in the source neural network and the corresponding input data.

In some implementations, the server system 2320 also includes a source neural network. However, this is not necessarily the case, and the server system 2320 may receive a training set from yet another system of data processing devices implementing the source neural network.

In operation, after the server system 2320 receives the training set (from the source neural network found at the server system 2320 itself or elsewhere), the server system 2320 may provide the training set to a trainer that trains the local neural network device 2305. The source approximator in the target local neural network device 2305 may be trained using the training set so that the target neural network approximates the operation of the source neural network.

Fig. 24, 25, 26, 27 are schematic illustrations of the use of representations of occurrences of topologies in the activity of a neural network in four different systems 2400, 2500, 2600, 2700. The systems 2400, 2500, 2600, 2700 can be configured to perform any of a number of different operations. For example, the systems 2400, 2500, 2600, 2700 can perform object localization operations, object detection operations, object segmentation operations, prediction operations, action selection operations, and so forth.

The object localization operation locates an object within an image. For example, a bounding box may be constructed around the object. In some cases, object localization may be combined with object recognition, in which the localized object is labeled with an appropriate name.

The object detection operation classifies image pixels as belonging to a particular class (e.g., belonging to an object of interest) or not belonging to that class. In general, object detection is performed by grouping pixels and forming a bounding box around the group of pixels. The bounding box should fit tightly around the object.
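
As a non-limiting illustration of forming a tight bounding box around a group of pixels, the following minimal sketch takes a binary mask in which 1 marks a pixel classified as belonging to the class of interest and returns the smallest axis-aligned box that contains every such pixel. The function name `bounding_box`, the (top, left, bottom, right) ordering, and the toy mask are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def bounding_box(mask: Sequence[Sequence[int]]) -> Optional[Box]:
    """Return the tightest box around all non-zero pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical per-pixel classification result for a 4 x 4 image.
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]

box = bounding_box(mask)
```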

Object segmentation typically assigns class labels to each image pixel. Thus, object segmentation is performed on a pixel-by-pixel basis and typically requires that only a single label be assigned to each pixel, rather than a bounding box.
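
As a non-limiting illustration of assigning a single class label to each pixel, the following minimal sketch selects, for every pixel, the class with the highest score. The availability of per-pixel class scores, the function name `segment`, and the toy score values are assumptions made for illustration.

```python
from typing import List, Sequence


def segment(class_scores: Sequence[Sequence[Sequence[float]]]) -> List[List[int]]:
    """Assign each pixel the index of its highest-scoring class."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in class_scores
    ]


# Hypothetical scores for a 2 x 2 image and two classes.
scores = [[[0.9, 0.1], [0.2, 0.8]],
          [[0.6, 0.4], [0.3, 0.7]]]

labels = segment(scores)
```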

The prediction operation seeks to draw conclusions that lie outside the range of the observed data. While prediction operations may seek to predict future occurrences (e.g., based on information about past and current states), prediction operations may also seek to draw conclusions about past and current states based on incomplete information about those states.

The action selection operation seeks to select an action based on a set of conditions. Action selection operations have traditionally been broken down into different approaches such as symbol-based systems (classical planning), distributed solutions, and reactive or dynamic planning.

Systems 2400, 2500 each perform a desired operation on a representation of a pattern of activity in a neural network. The systems 2600, 2700 each perform a desired operation on an approximation of the representation of the pattern of activity in the neural network. In the systems 2400, 2500, the pattern of activity represented occurs in a source neural network device 1705 that is part of the systems 2400, 2500 and is read from the source neural network device 1705. In contrast, in systems 2600, 2700, the pattern of activity that is approximately represented occurs in a source neural network device that is not part of systems 2600, 2700. However, an approximation of the representation of those patterns of activity is read from the source approximator 1905 that is part of systems 2600, 2700.

Turning in more detail to fig. 24, system 2400 includes a source neural network 1705 and a linear processor 2410. The linear processor 2410 is a device that performs operations based on linear combinations of features of representations (or approximations of such representations) of patterns of activity in a neural network. The operation may be, for example, an object localization operation, an object detection operation, an object segmentation operation, a prediction operation, an action selection operation, or the like.

Linear processor 2410 includes an input 2420 and an output 2425. The input 2420 is coupled to receive a representation of the pattern of activity in the source neural network 1705. The linear processor 2410 may receive a representation of the pattern of activity in the source neural network 1705 in a variety of ways. For example, the representation of the pattern of activity may be received as discrete events or as a continuous stream over a real-time or non-real-time communication channel. The output 2425 is coupled to output the processing results from the linear processor 2410. In some implementations, the linear processor 2410 may be implemented on one or more computing devices with relatively limited computing capabilities. For example, the linear processor 2410 may be implemented on a personal computer or mobile computing device such as a smart phone or tablet computer.

In fig. 25, system 2500 includes a source neural network 1705 and a neural network 2510. The neural network 2510 is a neural network device configured to perform operations based on a nonlinear combination of features of representations (or approximations of such representations) of patterns of activity in the neural network. The operation may be, for example, an object localization operation, an object detection operation, an object segmentation operation, a prediction operation, an action selection operation, or the like. In the illustrated implementation, the neural network 2510 is a feed-forward network that includes an input layer 2520 and an output layer 2525. As with the linear processor 2410, the neural network 2510 may receive a representation of the pattern of activity in the source neural network 1705 in a variety of ways.

In some implementations, the neural network 2510 may perform inferences on one or more computing devices having relatively limited computing capabilities. For example, the neural network 2510 may be implemented on a personal computer or mobile computing device such as a smart phone or tablet computer, for example, in a neural processing unit of such a device. Like system 2400, system 2500 will typically be a decentralized system in which remote neural network 2510 communicates with source neural network 1705, such as via a data communication network. In some implementations, the neural network 2510 may be, for example, a deep neural network, such as a convolutional neural network.

In fig. 26, system 2600 includes a source approximator 1905 and a linear processor 2410. Despite any differences between the approximation 1200' and the representation 1200, the linear processor 2410 may still perform operations on the approximation 1200'.

In fig. 27, system 2700 includes a source approximator 1905 and a neural network 2510. Despite any differences between the approximation 1200' and the representation 1200, the neural network 2510 may still perform operations on the approximation 1200'.

In some implementations, the systems 2600, 2700 can be implemented on an edge device, such as, for example, edge devices 2100, 2200 (fig. 21, 22). In some implementations, the systems 2600, 2700 can be implemented as part of a system, such as the system 2300 (fig. 23), in which the local neural network can be trained using representations of occurrences of topologies corresponding to activity in the source neural network.

Fig. 28 is a schematic illustration of a reinforcement learning system 2800 that includes an artificial neural network that can be trained using representations of occurrences of topologies corresponding to activity in the source neural network. Reinforcement learning is a type of machine learning in which an artificial neural network learns from feedback about the results of actions taken in response to its decisions. A reinforcement learning system moves from one state in the environment to another by performing actions and receiving information characterizing the new state, together with rewards and/or regrets characterizing the success (or lack of success) of the actions. Reinforcement learning seeks to maximize the total reward (or minimize the total regret) through the learning process.

In the illustrated implementation, the artificial neural network in the reinforcement learning system 2800 is a deep neural network 2805 (or other deep learning architecture) trained using reinforcement learning methods. In some implementations, the deep neural network 2805 may be a local artificial neural network, such as neural network 2510 (fig. 25, 27), and implemented locally on, for example, an automobile, aircraft, robot, or other device. However, this is not necessarily the case, and in other implementations, the deep neural network 2805 may be implemented on a system of networked devices.

In addition to the source approximator 1905 and the deep neural network 2805, the reinforcement learning system 2800 also includes an actuator 2810, one or more sensors 2815, and a teacher module 2820. In some implementations, the reinforcement learning system 2800 also includes one or more additional data sources 2825.

Actuator 2810 is a device that controls a mechanism or system that interacts with the environment 2830. In some implementations, the actuator 2810 controls a physical mechanism or system (e.g., steering of an automobile or positioning of a robot). In other implementations, the actuator 2810 can control a virtual mechanism or system (e.g., a virtual game board or an investment portfolio). Thus, the environment 2830 may also be physical or virtual.

The sensor 2815 is a device that measures characteristics of the environment 2830. At least some of the measurements characterize interactions between the controlled mechanism or system and other aspects of the environment 2830. For example, as the actuator 2810 maneuvers the automobile, the sensor 2815 may measure one or more of speed, direction, and acceleration of the automobile, proximity of the automobile to other features, and response of other features to the automobile. As another example, when the actuator 2810 controls an investment portfolio, the sensor 2815 may measure a value and risk associated with the portfolio.

Typically, both source approximator 1905 and teacher module 2820 are coupled to receive at least some of the measurements obtained by sensor 2815. For example, the source approximator 1905 may receive measurement data at the input layer 1915 and output an approximation 1200' of a representation of the topology that appears in the pattern of activity in the source neural network.

Teacher module 2820 is a device configured to interpret measurements received from sensors 2815 and provide rewards and/or regrets to deep neural network 2805. A reward is positive and indicates successful control of the mechanism or system. A regret is negative and indicates unsuccessful or less than optimal control. Typically, the teacher module 2820 also provides a characterization of the measurement results along with the rewards/regrets for reinforcement learning. Typically, the characterization of the measurement results is an approximation (such as approximation 1200') of the representation of the topology that appears in the pattern of activity in the source neural network. For example, the teacher module 2820 may read the approximation 1200' output from the source approximator 1905 and pair the read approximation 1200' with the corresponding rewards/regrets.
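As a minimal sketch of the pairing performed by the teacher module, the following example scores a measurement and attaches the score to an approximation. The scoring rule, the `tolerance` parameter, and the names `TrainingRecord`, `score_measurement`, and `pair` are assumptions made for illustration; the specification does not prescribe how measurements are interpreted.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TrainingRecord:
    approximation: Tuple[int, ...]  # stand-in for an approximation such as 1200'
    feedback: float                 # positive: reward; negative: regret


def score_measurement(error: float, tolerance: float = 1.0) -> float:
    """Hypothetical rule: reward control errors within tolerance, regret larger ones."""
    return tolerance - abs(error)


def pair(approximation: Sequence[int], error: float) -> TrainingRecord:
    """Pair the approximation read from the source approximator with its feedback."""
    return TrainingRecord(tuple(approximation), score_measurement(error))


# Example: a small control error yields a reward, a large one yields a regret.
good = pair([1, 0, 1], 0.25)
bad = pair([0, 1, 1], -3.0)
```

Here `good.feedback` is positive and `bad.feedback` is negative, matching the sign convention for rewards and regrets described above.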

In some implementations, reinforcement learning in the system 2800 does not occur in real time or during active control of the actuator 2810 by the deep neural network 2805. Instead, training feedback may be collected by the teacher module 2820 and used for reinforcement training when the deep neural network 2805 is not actively instructing the actuator 2810. For example, in some implementations, the teacher module 2820 may be remote from the deep neural network 2805 and in only intermittent data communication with the deep neural network 2805. Regardless of whether reinforcement learning is intermittent or continuous, the deep neural network 2805 may be evolved, for example, to use information received from the teacher module 2820 to optimize rewards and/or reduce regrets.
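One way to picture training that is deferred until the network is not actively controlling the actuator is a buffer that is filled during operation and drained afterwards. The sketch below is illustrative only: `FeedbackBuffer` and `ControllerStub` are hypothetical names, and the running-mean update is a simple stand-in for whatever reinforcement learning update an implementation would actually use.

```python
from collections import deque


class FeedbackBuffer:
    """Collects (approximation, feedback) pairs for later training."""

    def __init__(self):
        self._records = deque()

    def collect(self, approximation, feedback):
        self._records.append((tuple(approximation), float(feedback)))

    def drain(self):
        """Return all buffered records and empty the buffer."""
        records = list(self._records)
        self._records.clear()
        return records


class ControllerStub:
    """Hypothetical stand-in for a network that controls an actuator."""

    def __init__(self):
        self.controlling = False
        self.value = {}  # running mean feedback per approximation
        self.count = {}

    def train_offline(self, buffer):
        """Consume buffered feedback only when not actively controlling."""
        if self.controlling:
            return 0  # defer training; the buffer is left untouched
        records = buffer.drain()
        for approximation, feedback in records:
            n = self.count.get(approximation, 0) + 1
            mean = self.value.get(approximation, 0.0)
            self.value[approximation] = mean + (feedback - mean) / n
            self.count[approximation] = n
        return len(records)
```

Calling `train_offline` while `controlling` is `True` returns 0 and leaves the buffer intact, so feedback accumulated during operation is not lost.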

In some implementations, the system 2800 also includes one or more additional data sources 2825. The source approximator 1905 may also receive data from a data source 2825 at an input layer 1915. In these examples, the approximation 1200' will result from processing both the sensor data and the data from the data source 2825.

In some implementations, data collected by one reinforcement learning system 2800 may be used for training or reinforcement learning of other systems, including other reinforcement learning systems. For example, the characterization of the measurement results and the reward/regret values may be provided by the teacher module 2820 to a data exchange system that collects such data from a variety of reinforcement learning systems and redistributes the data among them. Furthermore, as discussed above, the characterization of the measurement results may be an approximation, such as approximation 1200', of the representation of the topology that appears in the pattern of activity in the source neural network.

The particular operations performed by reinforcement learning system 2800 will, of course, depend on the particular operating scenario. For example, in a scenario where the source approximator 1905, the deep neural network 2805, the actuator 2810, and the sensor 2815 are part of an automobile, the deep neural network 2805 may perform object localization and/or detection operations while the automobile is being maneuvered.

In implementations where the data collected by the reinforcement learning system 2800 is used for training or reinforcement learning of other systems, reward/regret values and approximations 1200' that characterize the state of the environment when performing object localization and/or detection operations may be provided to the data exchange system. The data exchange system may then distribute the reward/regret values and approximations 1200' to other reinforcement learning systems 2800 associated with other vehicles for reinforcement learning at these other vehicles. For example, reinforcement learning may be used to improve object localization and/or detection operations at a second vehicle using the reward/regret values and the approximations 1200'.
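The collection and redistribution performed by such a data exchange system can be sketched, under stated assumptions, as follows. The class name `DataExchange`, its methods, and the vehicle identifiers are hypothetical; in particular, the choice to return to each requester only the records contributed by other systems is an assumption made for the example.

```python
from collections import defaultdict


class DataExchange:
    """Collects (approximation, feedback) records and redistributes them."""

    def __init__(self):
        self._by_source = defaultdict(list)

    def submit(self, source_id, approximation, feedback):
        """Store a record contributed by one reinforcement learning system."""
        self._by_source[source_id].append((tuple(approximation), float(feedback)))

    def records_for(self, requester_id):
        """Return records contributed by every system other than the requester."""
        return [
            record
            for source_id, records in self._by_source.items()
            if source_id != requester_id
            for record in records
        ]


# Example: two vehicles contribute, and a third receives both contributions.
exchange = DataExchange()
exchange.submit("vehicle_a", [1, 0, 1], 0.8)
exchange.submit("vehicle_b", [0, 1, 1], -0.4)
for_a = exchange.records_for("vehicle_a")
for_c = exchange.records_for("vehicle_c")
```

Because the records carry only approximations and reward/regret values, a receiving system can use them without access to the contributor's raw sensor data.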

However, the operations learned at other vehicles need not be the same as those performed by the deep neural network 2805. For example, reward/regret values based on travel time, together with approximations 1200' resulting from the input of sensor data characterizing an unexpectedly wet road at a location identified, for example, by the GPS data source 2825, may be used for a route planning operation at another vehicle.

Embodiments of the operations and subject matter described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a computer storage medium for execution by, or to control the operation of, data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal (e.g., a machine-generated electrical, optical, or electromagnetic signal) that is generated to encode information for transmission to suitable receiver apparatus for execution by data processing apparatus. The computer storage medium may be or be included in a computer readable storage device, a computer readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. Furthermore, while the computer storage medium is not a propagated signal, the computer storage medium may be a source or destination of computer program instructions encoded in an artificially generated propagated signal. Computer storage media may also be or be included in one or more separate physical components or media, such as a plurality of CDs, disks, or other storage devices.

The operations described in this specification may be implemented as operations performed by data processing apparatus on data stored on one or more computer readable storage devices or received from other sources.

The term "data processing apparatus" encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, a system-on-a-chip, or multiple ones, or combinations, of the foregoing. The apparatus may comprise special purpose logic circuitry, for example an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit). In addition to hardware, the apparatus may include code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them. The apparatus and execution environment may implement a variety of different computing model infrastructures, such as web services, distributed computing, and grid computing infrastructures.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. The computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that stores other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Typically, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data (e.g., magnetic disks, magneto-optical disks, or optical disks). However, the computer need not have such devices. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game controller, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a Universal Serial Bus (USB) flash drive), to name just a few. Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; and magnetic disks, e.g., internal hard disks or removable disks. The processor and memory may be supplemented by, or incorporated in, special purpose logic circuitry.

To provide for interaction with a user, embodiments of the subject matter described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other types of devices may also be used to provide interaction with the user. For example, feedback provided to the user may be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback, and input from the user may be received in any form, including acoustic, speech, or tactile input. In addition, the computer may interact with the user by sending documents to and receiving documents from a device used by the user, for example, by sending web pages to a web browser on the user's client device in response to requests received from the web browser.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular implementations of the subject matter have been described. Other implementations are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.

Various embodiments have been described. However, it should be understood that various modifications may be made. For example, while representation 1200 is a binary representation in which each bit individually represents the presence or absence of a feature in the graph, other representations of information are possible. For example, a vector or matrix of multi-valued, non-binary digits may be used to represent the presence or absence of features and possibly other characteristics of those features. One example of such a characteristic is the weight of the active edges that constitute the feature.
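The difference between a binary representation and a multi-valued one can be illustrated with a short sketch. The feature names, the catalog ordering, and the choice to summarize a feature by the sum of its active edge weights are all assumptions made for the example; the specification states only that edge weight is one possible characteristic.

```python
def binary_representation(features, catalog):
    """One bit per catalogued feature: 1 if present, 0 if absent."""
    return [1 if name in features else 0 for name in catalog]


def weighted_representation(features, catalog):
    """One non-binary value per feature: here, the sum of its active edge weights."""
    return [sum(features[name]) if name in features else 0.0 for name in catalog]


# Hypothetical catalog of features and the edge weights observed for those present.
catalog = ["clique_dim1", "clique_dim2", "clique_dim3"]
observed = {
    "clique_dim1": [0.5, 0.25],
    "clique_dim3": [1.0, 2.0, 0.5],
}

bits = binary_representation(observed, catalog)
weights = weighted_representation(observed, catalog)
```

Both outputs have one entry per catalogued feature and neither records where in the graph a feature occurred; the weighted form simply carries an additional characteristic of each feature that is present.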

Accordingly, other embodiments are within the scope of the following claims.

Claims

1. A method performed by a data processing apparatus, the method comprising:

encoding an input video or audio data signal by characterizing a signaling activity in a recurrent artificial neural network and outputting a characterization of the signaling activity, the signaling activity in the recurrent artificial neural network being responsive to the input video or audio data signal, wherein in the recurrent artificial neural network nodes operate as accumulators, the method is performed by a data processing apparatus and comprises identifying a clique pattern of signaling activity of the recurrent artificial neural network, wherein the clique pattern of signaling activity encloses a cavity and the characterization of the signaling activity is indicative of the presence or absence of the clique pattern without encoding a position of the clique pattern in a graph of the recurrent artificial neural network.

2. The method of claim 1, wherein the method further comprises defining a plurality of time windows during which the signaling activity of the recurrent artificial neural network is responsive to the input video or audio data signal, wherein the clique pattern of signaling activity is identified in each of the plurality of time windows.

3. The method of claim 2, wherein the method further comprises identifying a first time window within the plurality of time windows based on a distinguishable likelihood of the clique pattern of signaling activity occurring during the first time window.

4. The method of claim 1, wherein identifying a clique pattern comprises identifying a directed clique of signaling activity.

5. The method of claim 4, wherein identifying a directed clique comprises discarding or ignoring lower-dimensional directed cliques present in higher-dimensional directed cliques.

6. The method of claim 1, further comprising:

classifying the clique patterns into categories, and

characterizing the signaling activity according to the number of occurrences of the clique patterns in a corresponding one of the categories.

7. The method of claim 6, wherein classifying the clique patterns comprises classifying the clique patterns according to a number of nodes within each clique pattern.

8. The method of claim 1, further comprising outputting a binary sequence of 0 and 1 from the recurrent artificial neural network, wherein each number in the sequence represents whether there are respective patterns of signaling activity in three or more nodes in the artificial neural network.

9. The method of claim 8, further comprising:

structuring the recurrent artificial neural network, including

reading the numbers output from the recurrent artificial neural network, and

evolving a structure of the recurrent artificial neural network, wherein evolving the structure of the recurrent artificial neural network comprises:

iteratively changing the structure,

characterizing the complexity of the pattern of signaling activity in a changed structure, and

using the characterization of the complexity of the pattern as an indication of whether the changed structure is desirable.

10. The method of claim 1, wherein the method further comprises:

identifying a decision moment in the recurrent artificial neural network based on a determination of a complexity of a pattern of signaling activity in the recurrent artificial neural network, wherein a decision moment is a point in time at which signaling activity in the recurrent artificial neural network indicates a completed result of information processing by the recurrent artificial neural network in response to the input video or audio data signal, the identification of the decision moment comprising

determining a particular time of a signaling activity having a complexity distinguishable from other signaling activities responsive to the input video or audio data signal, and

identifying the decision moment, and a particular time for reading output from the recurrent artificial neural network, based on the particular time of the signaling activity having the distinguishable complexity.

11. The method of claim 10, further comprising inputting the input video or audio data signal as a data stream into the recurrent artificial neural network and identifying the clique pattern of signaling activity during the input of the data stream.

12. The method of claim 1, further comprising evaluating whether the signaling activity is responsive to the input video or audio data signal, the evaluating comprising:

evaluating that a simpler pattern of signaling activity occurring earlier after an input event is responsive to the input video or audio data signal but that a more complex pattern of signaling activity occurring earlier after the input event is not responsive to the input video or audio data signal, and

evaluating that a more complex pattern of signaling activity occurring later after the input event is responsive to the input video or audio data signal but that a simpler pattern of signaling activity occurring later after the input event is not responsive to the input video or audio data signal,

wherein the simpler pattern of signaling activity involves more nodes than the more complex pattern of signaling activity.

13. An encoder comprising one or more computers operable to perform operations, the operations comprising the method of any preceding claim.

14. A signal transmission system comprising an encoder according to claim 13.

15. A data storage system comprising an encoder according to claim 13.