Los científicos del laboratorio australiano Cortical Labs crearon un chip con neuronas humanas, dijo este viernes a New Scientist el jefe del proyecto, Brett Kagan. El sistema, apodado ‘Dish Brain’, consiste en una placa de Petri con células cerebrales cultivadas encima de una matriz de microelectrodos capaces de estimularlas y detectar las señales. «Mediante la estimulación y el registro electrofisiológico, los cultivos se insertaron en un mundo de juego simulado, imitando el juego de árcade ‘Pong'», describen los científicos en un estudio al respecto, que fue subido en el servidor de preimpresión BioRxiv a inicios de diciembre y todavía no ha sido revisado por pares.
Integrating neurons into digital systems to leverage their innate intelligence may enable performance infeasible with silicon alone, along with providing insight into the cellular origin of intelligence. We developed DishBrain, a system which exhibits natural intelligence by harnessing the inherent adaptive computation of neurons in a structured environment. In vitro neural networks from human or rodent origins, are integrated with in silico computing via high-density multielectrode array. Through electrophysiological stimulation and recording, cultures were embedded in a simulated game-world, mimicking the arcade game ‘Pong’. Applying a previously untestable theory of active inference via the Free Energy Principle, we found that learning was apparent within five minutes of real-time gameplay, not observed in control conditions. Further experiments demonstrate the importance of closed-loop structured feedback in eliciting learning over time. Cultures display the ability to self-organise in a goal-directed manner in response to sparse sensory information about the consequences of their actions.
Harnessing the computational power of living neurons to create synthetic biological intelligence (SBI), previously confined to the realm of science fiction, is now tantalisingly within the reach of human innovation. The superiority of biological computation has been widely recognised with attempts to develop hardware supporting neuromorphic computing. Yet, no system outside biological neurons are capable of supporting at least third-order complexity which is necessary to recreate the complexity of a biological neuronal network (BNN). This raises significant challenges to any attempts to generate in silico neuronal models to predict function of BNN systems. Here we aim to establish functional in vitro networks of cortical cells from embryonic rodent and human induced pluripotent stem cells (hiPSCs) on high-density multielectrode arrays (HD-MEA) to demonstrate that these neural cultures can exhibit biological intelligence—as evidenced by learning in a simulated gameplay environment—in real time. Being able to successfully interact with SBIs would enable investigations into previously untestable areas. This would include, but not limited to, pseudo-cognitive responses as part of drug screening, bridging the divide between single cell and population coding approaches to understanding neurobiology, better understanding how BNNs compute to inform machine learning approaches, and potentially give rise to silico-biological computational platforms that surpass the performance of existing silicon-alone hardware. Indeed, some proponents suggest that generalised SBI may arrive before artificial general intelligence (AGI) due to the inherent efficiency and evolutionary advantage of biological systems.
This system that we termed DishBrain, can leverage the inherent property of neurons to share a ‘language’ of electrical (synaptic) activity with each other to link silicon and BNN systems through electrical stimulation and recording. Given the compatibility of hardware and cells, wetware, there are two interrelated processes that are required for sentient behaviour in an intelligent system. Firstly, the system must learn how external states influence internal states—via perception—and how internal states influence external states—via action. Secondly, the system must infer from its sensory states when it should adopt a particular behaviour. In short, it must be able to predict how its actions will influence the environment. To address the first imperative, custom software drivers were developed to create low latency closed-loop feedback systems that simulated exchange with an environment for BNNs through electrical stimulation. Closed-loop systems afford an in vitro culture ‘embodiment’ by providing feedback on the causal effect of the behaviour from the cell culture. Embodiment requires a separation of internal vs external states, where feedback of the effect of action on a given environment is available. Previous works both in vitro and in silico have shown that electrophysiological closed-loop feedback systems engender significant network plasticity and potentially behavioural adaptation over and beyond what can be achieved with open-loop systems. Further support for the link between embodiment and functional behaviour is found in vivo where disrupting a closed loop system by uncoupling visual feedback and motor outputs disrupts functional development of visual processing in the primary visual cortex in mice. This strongly supports the vital link between feedback and the eventual development of functional behaviour in biological neural networks.
To address the second requirement, the system can be used to test key theories for how intelligent behaviour may arise. One proposition for how intelligent behaviour may arise in an intelligent system embodied in an environment is found in the theory of active inference via the Free Energy Principle (FEP). Previous work has established that neurons can perform blind-source separation via a state-dependent Hebbian plasticity that is consistent with the FEP. The FEP suggests that any self-organising system separate from its environment seeks to minimizes its variational free energy. This means that systems like the brain—at every spatiotemporal scale— may engage in active inference by using an internal generative model to predict sensory inputs that represent the external world. The gap between the model predictions and observed sensations (‘surprise’ or ’prediction error’) may be minimised in two ways: by optimising probabilistic (Bayesian) beliefs about the environment to make predictions more like sensations, or by acting upon the environment to make sensations conform to its predictions. This implies a common objective function for action and perception that scores the fit between an internal model and the external environment.
Under this theory, BNNs hold ‘beliefs’ about the state of the world, where learning involves updating these beliefs to minimise their variational free energy or change the world, by action, to make it less surprising. If true, this implies that it should be possible to shape BNN behaviour by simply presenting noisy, unpredictable feedback following ’incorrect’ behaviour. If BNNs are presented with unpredictable feedback, they should adopt actions that avoid the states that resulted in this input. By developing a system that allows for neural cultures to be embodied in a simulated game-world, we are not only able to test whether these cells are capable of engaging in goal-directed learning in a dynamic envrionment, we are able to investigate a fundemental basis of intelligence.
Growth of neuronal ‘wetware’ for computation
Neurons can be grown or harvested in numerous ways. Cortical cells from the dissected cortices of rodent embryos can be grown on MEA in nutrient rich medium and maintained for months. These cultures will develop complicated morphology, with numerous dendritic and axonal connections, leading to functional BNNs. We successfully replicated the development of these cultures from embryonic day 15.5 (E15.5) mouse embryos, with a representative culture shown in. We also differentiated human induced pluripotent stem cells (hiPSCs) into monolayers of active heterogeneous cortical neurons which have been shown to display mature functional properties. Using a dual SMAD inhibition as previously described we developed long-term cortical neurons that formed dense connections with supporting glial cells. Finally, we wished to expand our study using a different method of hiPSC differentiation – NGN2 direct reprogramming – used in our final part of this study. Previous work has shown that human fibroblasts can be directly converted into induced neuronal cells which express a cortical phenotype. This high yield method was replicated in this work with cells displaying pan-neuronal markers. These cells typically display a high proportion of excitatory glutamatergic cells, quantified using qPCR shown in. Integration of these cells on the HD-MEA was confirmed via scanning electron microscopy (SEM) where cells had been maintained for > 3 months. Routine SEM imaging revealed dense clustering of neurons, with a clear contrast between cell and the MEA surface. Densely interconnected dendritic networks could be observed in neuronal cultures forming interlaced networks spanning the MEA area. These neuronal cultures appeared to rarely follow the topography of the MEA and were more likely to form large clusters of connected cells with dense dendritic networks. This is likely due to the large size of an individual electrode within the MEA, however, there are also chemotactic effects that can contribute to counteract the effect of substrate topography on neurite projections.
Neural cells show well-characterised spontaneous action potentials which develop over time
We mapped the in vitro development of electrophysiological activity in neural systems at high spatial and temporal resolution. Robust activity in primary cortical cells from E15.5 rodents was found at days in vitro (DIV)14 where bursts of synchronised activity was regularly observed as previously demonstrated. In contrast, while similar to previous reports, synchronised bursting activity was not observed in cortical cells from an hiPSC background differentiated using the dual SMAD inhibition protocol (DSI) until DIV 73. hiPSCs differentiated using NGN2 direct reprogramming showed activity much earlier, typically between days 14 and 24. Daily activity scans of electrophysiological activity were also conducted. While max firing rate typically increased and remained relatively stable over time for all cell types during the testing period, changes were observed in both the mean firing rate and variance in firing rate over the days of testing. In particular, hiPSCs differentiated using the NGN2 direct reprogramming method showed a considerable increase in mean firing rate and the variance in firing over days.
Building a modular, real-time platform to harness neuronal computation
We developed the DishBrain system to leverage neuronal computation and interact with neurons in an embodied environment. The DishBrain environment is a low latency, real-time system that interacts with the vendor MaxOne software, allowing it to be used in ways that extend its original functions. This system can record electrical activity in a neuronal culture and provide external (non-invasive) electrical stimulation in a comparable manner to the generation of action potentials by internal electrical stimulation. Using the coding schemes described in methods, external electrical stimulations convey a range of information: predictable, random, or sensory. This setup enables one to not only ‘read’ information from a neural culture, but to ‘write’ sensory data into one. The initial proof of principle using DishBrain was to simulate the classic arcade game ‘pong’ by delivering inputs to a predefined sensory area. Similarly, the electrophysiological activity of pre-defined motor regions was gathered—in real time—to move a ‘paddle’. Preliminary investigations compared different motor region configurations using an EXP3 algorithm. This aimed to identify whether the neural cultures had activity that was more successful under specific configurations by choosing setups that resulted in a higher hit rate. Experimental cultures showed significantly different preferences for configurations compared to media-only controls. While media-only controls showed a preference for configurations that maximised bias—where sensory stimulation alone could direct the gameplay towards higher performance (blinded fully later)—experimental cultures showed a preference for configuration that enabled lateral inhibition.
Increasing the density of sensory information input leads to increased performance
The DishBrain protocol was refined over a series of pilot studies, which can be grouped into three broad experiments, each increasing in density of sensory information. The first experiment operated with a 4Hz stimulation that was purely rate coded. Experiment two included the EXP3 based configuration. Experiment 3 removed the EXP3 based configuration, locked to the layout in, and change to the combined rate (4 – 40Hz) and placed coding method of data input. Notably the biggest increase was between second and third experiments with the introduction of rate coding of the ball’s position to supplement the purely rate coded approach used previously. The gameplay for the final fifteen minutes for each culture type was compared. Cultures displayed a significant increase in performance between the second and final session and the first and final session. Between cultures, human cortical cells (hCCs) had significantly longer average rally lengths than cultures with mice cortical cells (mCCs). This is interesting because it suggests that—at a neuronal level—cortical cells from a human origin, in this case from hiPSCs, can out-perform cells from an embryonic mouse, even when overall cell numbers are comparable. Overall, the magnitude of this change supports that increasing sensory information successively improved performance, even when cell culture features were kept constant.
Biological neural networks learn over time when embodied in a gameplay environment
To test the theory of active inference via the FEP, using the parameters described in Methods, cortical cells, mCC and hCCs, were compared to media-only controls (CTL), rest sessions—where active cultures controlled the paddle but received no sensory information (RST), and to in-silico controls that mimicked all aspect of the gameplay but paddle was driven by random noise (IS), over 399 test-sessions (80-CTL, 42-RST, 38-IS, 101-mCCs, 138-hCCs). The average rally length (total number of successful ball intercepts by the paddle) showed a significant interaction, with differences occurring based on a combination of group and time (first five and last fifteen minutes). Only the mCC and hCC cultures showed evidence of learning over time, with significantly increased rally lengths at the second timepoint compared to the first. Further, it was found that during the first five minutes of gameplay key significant differences were observed. The hCC group performed significantly worse than mCC, CTL and IS groups. This suggests that hCCs seem to perform worse than controls when first embodied in an environment, suggesting an initially maladaptive control of the paddle. Notably, at the latter timepoint this trend was reversed, the hCC group significantly outperformed all control groups along with a slight but significant difference over the mCC group. Likewise, the mCC group significantly outperformed all control groups. This data replicates our earlier finding on the differences between mouse and human cells, along with unambiguously demonstrating a significant learning effect in both experimental groups that was absent in the control groups (Movie S1).
Nuances between how learning occurs exhibits differences between cell types
To determine how the above learning arose, key gameplay characteristics were examined further. The number of times the paddle failed to intercept the ball on the initial serve (aces; Figure 6C) and the number of long rallies (> 3 consecutive hits; Figure 6D) were calculated for this data. As with average rally length, significant interactions between groups and time were found both for aces and long rallies. Only the mCC and hCC groups showed significantly fewer aces in the latter timepoint compared to the first. Likewise, only the mCC and hCC groups showed significantly more long rallies in the latter timepoint compared to the first. This shows that both experimental cultures improved performance by not only reducing how often they missed the initial serve, but by achieving more consecutive hits also. Similarly, a significant difference between groups was found both for aces and long rallies. At the first timepoint for aces, it was found that the RST condition had significantly more aces than the CTL and mCC groups. It is difficult to determine exactly why allowing the paddle to be controlled by unstimulated cells would result in more aces initially than other groups. Perhaps there is a degree of sporadic behaviour that the cells engage with when initially introduced to the rest period from gameplay that results in this behaviour, possibly similar to what was observed in average rally length by hCCs above. When the number of long-rallies at this time point was investigated, it was found that only HCC had significantly fewer long-rallies compared to all groups. This is consistent with the finding that hCC do show worse performance in the first time point overall and explains why this may be observed.
Significant differences between groups at the latter timepoint was also found both for aces and long rallies. Most notably the HCC group showed significantly fewer aces compared to CTL, RST, and IS groups. The mCC group also showed significantly fewer aces compared to RST and IS groups, however not the CTL group. In contrast, for long-rallies the mCC group showed significantly more than the CTL, RST and IS groups. Yet the hCC group only showed significantly more long-rallies compared to the IS group, but not RST or CTL. Moreover, here a significant negative correlation was found, suggesting that the performance was not arising out of a maladaptive behaviour such as fixing the paddle to a single corner. Wholistically, Figure 6F emphasises that although both mCC and hCC showed fewer aces and more long-rallies in the latter timepoints compared to the first, the cell types did display nuances in their behaviour, highlighting differences between cell types. The data also further suggests that an unstimulated culture still controlling the paddle will have significantly poorer performance than controls where the paddle is moved based on noise. This does suggest a systematic control that is difficult to interpret from this data but does indicate the potential of an enduring embodiment once stimulation ceases.
Biological neural networks require feedback for learning
To investigate the importance of feedback type for learning, cultures were tested under three conditions, for three days, with three sessions per day resulting in 483 sessions. Condition 1 (Stimulus) mimicked that used above, where predictable and unpredictable stimuli were administered when the cultures behaved desirably or not, respectively. Condition 2 (Silent) involved the stimulus feedback being replaced with a matching time-period in which all stimulation was withheld. Condition 3 (No-feedback) removed the restart after a miss. When the paddle did not successfully intercept the ball, the ball would bounce and continue without interruption: the stimulus reporting ball position was still provided. The difference between these conditions is emphasised in Figure 7A. Rest period activity was also gathered used to normalise performance on a per session basis to account for differences in unstimulated activity.
Stimulus and Silent conditions showed overall higher performance compared to Rest and No-feedback conditions. When testing for differences between groups in the percentage increase of average rally length over matched rest controls, a significant interaction was found. Only the Stimulus condition showed a significant increase in average rally length over time. While no differences were found for the first timepoint, a significant main effect of group was found at the second timepoint, where the Stimulus condition performed significantly higher than the Silent and No-feedback conditions. Interestingly, the Silent condition also significantly outperformed the No-feedback conditions, although with less magnitude. Importantly, this demonstrates that information alone is not sufficient; feedback is required to form a closed loop learning system. When followed up at the level of day for the second timepoint no significant differences over time were observed, but the between group differences were still observed. This trend was replicated when looking at aces both summed and across days of testing. For long rallies the Stimulus group at timepoint 1 showed significantly fewer long-rallies compared to the Silent and No-feedback condition, being reversed at timepoint 2 with the Stimulus group showing significant more long rallies compared to the No-feedback condition. No difference was found when this was followed up across day. We also demonstrate that this learning is not seen in electrically inactive non-neural cells. Collectively this data establishes that adaptive behaviour seen in cortical cells altering activity to manipulate the environment can be an emergent property of engaging with—and implicitly modelling—the environment.
Electrophysiological symmetry in latent activity is linked with higher performance
To determine whether spontaneous action potentials correlated with performance exploratory uncorrected Pearson’s correlations were computed for key activity metrics and average rally length in the last 15 minutes of gameplay. A significant positive correlation between mean firing and performance was found indicating a higher mean firing was associated with better performance, although max firing did not significantly correlate. This suggests that having well balanced higher activity was related to better performance, although the correlation was notably moderate. To further investigate whether the topographical distribution of activity correlated with performance, the absolute values of four discrete cosine transform (DCT) coefficients normalised to mean activity, was used to summarise spatial modes of spontaneous activity and assess symmetry of activity. DCT (0,2) which shows difference between activity on the lateral edges and the lateral centre was significantly negatively correlated with performance. However, DCT (0,1) which measures activity across the horizontal plane, DCT (1,0) which measures activity across the vertical plane, and DCT (2,0) which measures activity on the horizontal edges vs the horizontal centre, did not significantly correlate. These correlations indicate that symmetrical activity across cultures underwrites better performance, but max activity does not. Given the distribution of the motor regions and sensory information, this finding is very coherent, as if there are no active cells in an area to either record signal from or deliver stimulation too, it would result in a dysfunctional system.
Here we present a system, Dishbrain, which is capable of embodying neurons, from any source, in a virtual environment and measuring their responses to stimuli in real time. The ability of neurons, especially in assemblies, to respond to external stimuli in an adaptive manner is well established in vivo. However, this work is the first to establish this fundamental behaviour in vitro. We were able to use this silico-biological assay to investigate the fundamentals of neuronal computation. In brief, we demonstrate the first SBI device to show adaptive behaviour in real-time. The system itself offers opportunities to expand upon previous in silico models of neural behaviour, such as where models of hippocampal and entorhinal cells were tested in solving spatial and non-spatial problems (Sanders et al., 2020; Whittington et al., 2020). Minor variations on the DishBrain platform and selected cell type would enable an in vitro test to gain data around how cells process and compute information that was previously unattainable.
An example of this can be seen in the contrasting results between different cell sources. Active cortical cultures, from both human and mouse cell sources, both displayed synchronous activity patterns, in line with previous research. Yet importantly, significant differences between cell sources were observed as human cortical cells always outperformed mouse cortical cells with nuances in gameplay characteristics. Although further work is required, this is the first work finding empirical evidence supporting the hypothesis that human neurons have superior information-processing capacity over rodent neurons. This inherent difference between cell sources has been proposed to be due to denser and longer dendritic trees in human neurons, compared to mouse, which would yield different input-output properties and may thereby explain different computational capacities. It was not previously possible to separate the neuroanatomical structure of different species from the microscopic structure of neurons in terms of computational power. Our work demonstrates that even when all key features are kept constant (cell number, sensory input, motor output etc.), there are key differences between human and rodent cortical neurons. This provides the first empirical evidence of differences in computational power between neurons from different species offering an exciting avenue for future research. Another finding from this work relates to innate cell organisation, seen in the definition of motor regions. Previously, motor regions were mapped from a population coding approach that incorporated spatial information following a network activity scan5. While our early pilot studies were similar, we focused on the extent that self-organisation would adapt if motor regions were fixed between cultures. When the EXP3 algorithm was used we found that experimental cultures showed significant preferences for layouts that could leverage biological processes like lateral inhibition. This is consistent with past work that finds feedback between environment and action is required for proper neural development7. However, it further suggests that perhaps this development occurs based on properties inherent at the level of the cell. This system provides the opportunity to explore network dynamics to better understand this aspect of self-organisation. At the technical level, this system is readily adaptable to include investigations into structural organisation of neural networks in both a physical and computational sense.
Most significantly, this work represents a substantial technical advancement in creating closed-loop environments for BNNs. Here, we have emphasised the requirement for embodiment in neural systems for learning to occur. This is seen most significantly in the relative performance over experiments, where richer information and better feedback resulted in increased performance. Likewise, when no-feedback was provided yet information on ball position was available, cultures showed significantly poorer performance and no learning. Of particular interest was the finding that when stimulatory feedback was removed and replaced with silent feedback (i.e., the removal of all stimuli), cultures were still able to outperform those with no feedback as in the open-loop condition, albeit at a lesser extent. One interpretation is that playing ’pong’ generates more predictable outcomes than not playing ’pong’. Despite the outcome of a ‘failure’ not being unpredictable stimulation, given that the ball resets and the direction of the next movement is itself also unpredictable, this likely results in increased informational entropy, albeit to a far less extent. This is coherent with our results, as the more unpredictable an outcome, the greater the observed learning effect. However, the action of the BNN must have an outcome observable by the system. Therefore, it is coherent that the open-loop condition, which is by its nature the most predictable condition, did not result in learning. Stimulus alone is not sufficient to drive learning, there must be a motivation for the learning where altered behaviour can influence the future observable stimulus. When faced with unpredictable stimulus following unsuccessful performance, playing ’pong’ successfully acts as a free energy minimising solution. This offers a rather deflationary account of all goal-directed behaviour as the goal is just to minimise surprise. A key aspect of active inference is the selection of actions that minimise free energy expected following that action.
On this mechanistic level, we sought to demonstrate the utility of the DishBrain by testing base principles behind the idea of active inference via the FEP for intelligence, finding robust support for it. The closest previous work included studies of blind source separation in neural cultures. However, this study did not offer physiologically plausible training environments and the system effectively existed in an open-loop environment. This makes any interpretation that the system in these studies was operating under the FEP difficult as changes in the external environment was not related back to the internal system of the neuronal cultures. Our work here demonstrates that when supplying unpredictable (random) sensory input following an ’undesirable’ outcome—and providing predictable input following a ’desirable’ one, we were able to significantly shape the behaviour of neural cultures in real time. The predictable stimulation could also be read as a stabilising synaptic weights in line with previous research—or, in a complementary fashion, destabilising connectivity by destroying ’undesirable’ free energy minima. This may be a potential mechanism behind the FEP account of biological self-organisation, sometimes discussed in terms of autovitiation (i.e. self-organised instability by the destruction of self-induced but surprising fixed points of attraction). Crucially, expected free energy corresponds to uncertainty (i.e., informational entropy). This means that uncertainty minimising behaviour will have a natural curiosity, in the sense that it is necessarily information seeking. This is closely related to artificial curiosity in machine learning and intrinsic motivation in robotics.
Due to current hardware limitations, the sensory stimulation is magnitudes coarser compared to that for in vivo organisms. Additionally, it was infeasible to meaningfully implement mechanisms that would be crucial for an in vivo organism attempting a comparable task, such as proprioception. Moreover, the relatively small number of cells embedded in a monolayer format means the neural architecture driving this behaviour is incredibly simple, in terms of the number of possible connections available compared to even small organisms that have a 3D brain structure. Nonetheless, using only simple patterns of predictable and unpredictable stimulation, this system was able to shape behaviour in an order of minutes. While within session learning was well established, between session learning over multiple days was not observed so robustly. Cultures appeared to relearn associations, with each new session. Given that cortical cells were selected, this is to be expected. In vivo cortical cells are not known to be specialised for long-term memory. Future work with this system can investigate the use of other neuronal cell types and/or more complex biological structures.
Using this DishBrain system, we have demonstrated that a single layer of in vitro cortical neurons can self-organise and display intelligent and sentient behaviour when embodied in a simulated game-world. We have shown that even without a substantial filtering of cellular activity, statistically robust differences over time and against controls could be observed in the behaviour of neuronal cultures in adapting to goal directed tasks. These findings provide a convincing demonstration of the SBI based system to learn over time in a goal-orientated manner directed by input. The system provides the capability for a fully visualised model of learning, where unique environments may be developed to assess the actual computations being performed by BNNs. This is something long sought after and extends beyond purely in silico models or predictions of molecular pathways alone (Karr et al., 2012; Whittington et al., 2020; Yu et al., 2018). Therefore, this work provides empirical evidence which can be used to support or challenge theories explaining how the brain interacts with the world and intelligence in general. Ultimately, although substantial hardware, software, and wetware engineering is obviously still required to improve the DishBrain system, this work does evince the computational power of living neurons to learn adaptively in active exchange with their sensorium. This represents the largest step to date of achieving synthetic sentience capable of true generalised intelligence.
All experimental procedures were conducted in accordance with the Australian National Statement on Ethical Conduct in Human Research (2007) and the Australian Code for the Care and Use of Animals for scientific Purposes (2013) as required. Animal work was done under ethical approval E/1876/2019/M from the Alfred Research Alliance Animal Ethics Committee B. Experiments were performed at Monash University, Alfred Hospital Prescient with the appropriate personal and project licences and approvals. Work done using hiPSCs was done in keeping with the described material transfer agreement below.
No statistical methods were used to predetermine sample size. As all work was conducted within controlled environments uninfluenced by experimenter bias, experiments were not randomized, and investigators were not blinded to experimental condition. However, conditions were blinded before final analysis to prevent bias during analysis. Figure S5A presents a schematic of the overall experimental setup.
Animal Breeding and maintenance
BL6/C57 mice were mated at Monash Animal Research Platform (MARP). Upon confirmation of pregnancy animals were transported via an approved carrier to the Alfred Medical Research and Education Precinct (AMREP). Pregnant animals were housed in individually ventilated cages until the date when they were humanely killed, and primary cells were harvested.
Primary Cell Culturing
Cortical cells were disassociated from the cortices of E15.5 mouse embryos. Embryos were decapitated and with the aid of a stereotactic microscope the skin, bone and meninges were removed, and the anterior part of the cortex dissected out. Approximately 800,000 cells were plated down onto each pre-prepared HD-MEA. Cultures began to upregulate spontaneous activity and display synchronised firing around DIV 10 at which point they were used for experimentation.
Stem Cell Lines
Initial work was conducted using a control hiPSC line supplied by the Gene Editing Facility at the Murdoch Children’s Research Institute (ATCC® PCS-201-010) from an ATCC PCS-201-010 background and transferred under a Material Transfer Agreement. Later work involved an hiPSC lines used in this work constitutively expressing fluorescent reporters under control of the glyceraldehyde 3-phosphate dehydrogenase (GAPDH) promoter (cell lines were generated by Professor Edouard G. Stanley and colleagues from the Murdoch Children’s Research Institute and provided under a Material Transfer Agreement). The GAPDH gene encodes a protein critical in the glycolytic pathway, whereby ATP is synthesised from glucose. As this function is highly conserved across multiple cell types GAPDH is ubiquitously expressed at high levels across multiple cell types, making it a suitable gene for which to base a gene-expression system . This transgene expression system, termed GAPTrap, involves the insertion of the specific reporter gene into the GAPDH locus in hiPSCs using gene-editing technology. For this study, RM3.5 GT-GFP-01 constitutively expressing green fluorescent protein under the GAPDH promoter was utilised. The RM3.5 hiPSC line was initially derived from human foreskin fibroblasts and reprogrammed using the hSTEMCCAloxP four factor lentiviral vector as reported previously. All procedures described below were applied to be both cell lines. Both lines were maintained in an undifferentiated, pluripotent state in a feeder-free system using E8 media (StemCell Technologies, Canada) supplemented by a Penicillin/streptomycin solution at 5 µl/ml. Cells were plated on T25 353108 Blue Vented Falcon Flasks (Corning, Durham, USA) that were coated approximately 1 hr prior with the extracellular matrix vitronectin (Thermo Fisher Scientific, Carlsbad, USA).
Stem Cell Maintenance
All procedures were carried out using sterile techniques. Prior to passaging cell confluence was recorded and the required split ratio was determined. Media was aspirated from cells and cells were washed with 5 ml of PBS -/- before passaging to remove detached cells and other debris. 3 ml of a 0.05 µM EDTA in PBS -/- was used for the dissociation and passaging of hiPSCs as aggregates without manual selection or scraping, was added to cells, and allowed to incubate at 37°C for approximately 3.5 mins. After visual examination using 10X microscope indicated that cells had lost sufficient adhesion, EDTA was aspirated, and blunt trauma applied to base of the T25 flask to dislodge cells. Cells were suspended in 2 ml E8 and transferred to 15 ml falcon tube. As described above, vitronectin coated T25 flasks were prepared and aspirated before the addition of 5 ml of E8 solution. Approximately 1:10 of evenly distributed cell suspension was added to the prepared T25 flask. The flask was then gently swirled to ensure even distribution before being incubated overnight at 37°C. Media was changed daily.
Stem Cell Dual SMAD Differentiation
Cellular differentiation followed a titrated dual SMAD inhibition protocol for the generation of cortical cells from pluripotent cells established by the Livesey group with minor adjustments as represented in Figure S5B. Cells were plated in 24 well plates coated with human laminin H521. When cells reached ≈80% confluency, neural induction was initiated by using standard neural maintenance (N2B27) Base Media with 100 ng/ml LDN193189 (Stemcell Technologies Australia, Melbourne, Australia) and 10 µm SB431542 (Stemcell Technologies Australia, Melbourne, Australia). Media was changed every day from day 0 to day 12. After appearance of neural rosettes and initial passaging standard N2B27 media with FGF2 20 ng/ml was utilised from day 12 to day 17 to achieve a dorsal forebrain patterning. Cells were then expanded and deemed ready for plating onto MEA or slides based on morphology at approximately 30 – 33 days. On the day of transplant, cells were detached with Accutase (Stemcell Technologies Australia, Melbourne, Australia) to a single cell suspension and centrifuged at 300g. The cell pellet was resuspended at 10,000 cells/µl in BrainPhys (Stemcell Technologies Australia, Melbourne, Australia) neural maintenance media with Rho Kinase Inhibitor IV (Stemcell Technologies Australia, Melbourne, Australia;1:50 dilution) with approximately 106 cells plated onto each MEA. Cells began to display early but widespread spontaneous activity around DIV 80, at which point they were ready for experimentation.
Stem Cell NGN2 Direct Differentiation
Cortical excitatory neurons were generated by the expression of NGN2 in iPSCs. IPSCs were plated at 25,000 cells/cm2 in a 24-well plate coated with 15µg/ml human laminin (Sigma, USA). The following day, cells were transduced with NGN2 lentivirus (containing a tetracycline-controlled promoter coupled with a puromycin selection cassette) in combination with a lentivirus for the rtTA (reverse tetracycline-controlled transactivator). NGN2 gene expression was activated by the addition of 1 µg/ml doxycycline (Sigma, Australia), this was referred to as differentiation day 0. Cells were cultured in neural media consisting of 1:1 ratio of DMEM/F12:Neurobasal media supplemented with (all reagents from Thermofisher, USA) B27 (#17504-044), N2 (17502-048), Glutamax (#35050-060), NEAA (#11140-050), β-mercaptoethanol, ITS-A (#51300-044) and penicillin/streptomycin (#15140-122). On Day 1 1.0µg/ml puromycin (Sigma, Australia) was added for 3 days at which point neurons were supplemented with 10µg/ml BDNF (Peprotech, USA) and lifted with accutase, in preparation for plating on HD-MEA chips. HD-MEA chips were pre-treated with 100µg/ml PDL (Sigma, USA) and 15µg/ml laminin (Sigma, USA). For each well 1×105 NGN2 induced neurons at DD4 were combined with 2.5×104 primary human astrocytes (ScienceCell, USA) in each well of the MEA plate. To arrest cell division of astrocytes 2.5µM Ara-C hydrochloride (Sigma, USA) was added at day 5 for 48 hours. Cells were maintained in neural media supplemented with BDNF and media changed at least 1 day prior to recordings.
MEA setup and preparation
MaxOne Multielectrode Arrays (MEA; Maxwell Biosystems, AG, Switzerland) were used for this research. The MaxOne is a high-resolution electrophysiology platform featuring 26,000 platinum electrodes arranged over an 8 mm2. The MaxOne system is based on complementary meta-oxide-semiconductor (CMOS) technology and allows recording from up to 1024 channels and stimulation from up to 32 units. MEAs and chambered glass slides are coated with either polyethylenimine (PEI) in borate buffer for primary culture cells or Poly-D-Lysine for cells from an iPSC background before being coated with either 10 µg/ml mouse laminin or 5 µg/ml human 521 Laminin (Stemcell Technologies Australia, Melbourne, Australia) respectively to facilitate cell adhesion.
Plating and Maintaining Cells on MEA
Approximately 106 cells were plated on MEA after preparation via method already described. Cells were allowed approximately one hour to adhere to MEA surface before the well was flooded. The day after plating, cell culture media was changed to BrainPhys™ Neuronal Medium (Stemcell Technologies Australia, Melbourne, Australia) supplemented with 1% penicillin-streptomycin. Cultures were maintained in a low O2 incubator kept at 5% CO2, 5% O2, 36°C and 80% relative humidity. Every two days, half the media from each well was removed and replaced with free media. Media changes always occurred after all recording sessions.
Measuring of Electrophysiological Activity
Licenced MaxLab Live Scope V20.1 software was used to run activity scans. Checkerboard assays consisting of 14 configurations at 15 seconds of spike only record time were run daily immediately preceding the running of the DishBrain software. Gain was set to 512x with a 300 Hz high pass filter. Spike threshold was set to be a signal six sigma greater than background noise as per recommended software settings. Mean, max and variance of both amplitudes and firing rates was extracted from these assays and mapped using custom software: the first nine components of discrete cosine transform basis functions of space were used to summarise the spatial profile of spiking activity. The ensuing coefficients were then used in subsequent correlation analyses.
DishBrain software platform
The current DishBrain platforms is configured as a low-latency, real-time MEA control system with on-line spike detection and recording software. See Figure S3 and Supplementary Text 1. The DishBrain software runs at 20,000 Hz and allows recording at this incredibly fine timescale. Working closely with MaxWell Biosystems we enabled capabilities not available using the native vendor software. The existing API was used only for loading configurations. Low level code was written in C to allow for minimal processing latencies—so that packet processing latency was typically <50 µs. High level code, including configuration set ups and broader instructions for game settings were implemented in Python. Figure S5C shows an image of the game visualiser, and a real-time interactive version is available at https://spikestream.corticallabs.com/. This allowed a spike-to-stim latency of approximately 5 ms, with the substantive delay due to inflexible hardware buffering built into MaxOne hardware. Where appropriate, the EXP3 machine learning algorithm was used to sample two predefined motor regions to select the best configuration to interpret movement commands for the paddle. When the system failed to move the ‘paddle’ into a correct position (to contact the ball), a random stimulus was applied to culture at 5 Hz and 150 mV. After a 1 s delay—to allow the culture to recover—play was resumed. The online spike detection software was developed using an adaptive threshold-based detector. The threshold was typically set at 6 sigmas above noise estimates. We established this use a mean absolute deviation (MAD) estimate, which was multiplied by a correction factor. Infinite Impulse Response (IIR) filters and Digital Signal Processing (DSP) were used to prepare raw signals for spike detection.
Stimulation is delivered at a given Hz and voltage as described in the main text to key electrodes in a sensory area, as shown in Figure 4B. Initial experiments delivered purely place-coded stimulation, where the distance from the centre of the sensory area was interpreted as distance from the centre of the paddle aligning with the ball. As described in the main text, later experiments adopted a mixed coding scheme, where the place coding was combined with a rate coding that delivered stimuli at 4 Hz when the ball was closest to the opposing wall and increased to a max of 40 Hz as the ball reached the paddle wall.
Initially two predefined motor regions were defined on the MEA. Activity was measured over these two regions, where the region with higher activity would move the paddle in a corresponding direction. This was found to be extremely sensitive to culture characteristics, where asymmetrical spontaneous spiking activity in cultures would cause the paddle to move swiftly in only one direction. To counter this, a gain function was implemented, which measured activity in both regions and added a multiplier to a target of 20 Hz. Activity >20 Hz was weighted by a correction factor >1, while activity <20 Hz was weighted by a correction factor <1. An EXP3 algorithm was implemented to select the different configuration options illustrated in Figure S3. We found that experimental cultures preferred configuration 3, while media only control cultures preferred configuration 0. As such, configuration 3 was selected as it offered the possibility for biologically relevant features, such as lateral inhibition, and minimised the chance of apparently successful performance through bias alone—as it precludes a direct relationship between input stimulation and output activity recording.
An Exponential-weight algorithm for Exploration and Exploitation (EXP3) algorithm was used initially for the adaptive selection of electrode layouts, with the objective of optimising gameplay performance. This algorithm was implemented to maintain a list of weights for each action and was designed to minimise regret by preferencing electrode configurations which were associated with a higher probability of the ball being returned. This is described in detail in Supplementary Text 2.
Cells were washed three times with sterile PBS and then fixed using 4% PFA for 20 mins. After washing cells were blocked 0.3% Triton-X and 1% goat serum in PBS for 1 hr. Primary antibodies specific for Synapsin1 (1:500; ab254349; Rabbit; Abcam, Cambridge, MA, USA), NeuN (1:500; ab104225; Rabbit; Abcam, Cambridge, MA, USA), Beta-III Tubulin (1:500; MAB1637, Mouse; Kenilworth, NJ, USA), MAP2 (1:1000; Chicken; ab5392; Abcam, Cambridge, MA, USA), TBR1 (1:200; ab183032; Rabbit; Abcam, Cambridge, MA, USA), GFAP (1:500; ab4674; Chicken; Abcam, Cambridge, MA, USA), and KI67 (1:500; ab245113; Mouse; Abcam, Cambridge, MA, USA) were applied overnight. After washing, secondary antibodies (chicken 555, rabbit 488, mouse 647; Abcam, Cambridge, MA, USA) were incubated for 2 hrs. This was followed by 10 mins of DAPI Staining Solution in PBS (1:1000, ab228549, Abcam, Cambridge, MA, USA) after which point slides were cover-slipped with ProLong Gold Antifade Mountant (Thermo Fisher Scientific, Waltham, MA, USA) mounting media and allowed to dry for 48 hrs. Scanning Electron Microscopy. At various designated endpoints, media was aspirated from the MEA wells and cells were fixed with 2.5% glutaraldehyde (Electron Microscopy Sciences, PA, USA) and 2% paraformaldehyde (Electron Microscopy Sciences, PA, USA) in a 1 M sodium cacodylate buffer for 1 hr. They were then washed three times in 1 M sodium cacodylate buffer before being post-fixed with 1% OsO4 in a 1M sodium cacodylate buffer for 1 hr. OsO4 was removed and the fixed cells were washed with three times in milliQ water and dehydrated via an ethanol gradient exchange (30%, 50%, 70%, 90%, 100%, 100% v/v) for 15 mins each. After dehydration, the cells were dried by hexamethyldisilazane (Sigma Aldrich, St. Louis, MO, USA) exchange (3×10 mins), and then allowed to evaporate for 5-10 mins. MEA chips were then affixed to an aluminium stub with carbon tape and sputter coated with 30 nm layer of gold using a BAL-TEC SCD-005 gold sputter coated. All procedures were performed at room temperature. Coated MEA chips were then imaged using a FEI Nova NanoSEM 450 FEGSEM operating with an acceleration voltage of 10 kV and a working distance of 12 mm. Images were then analysed using ImageJ v. 1.52k and false coloured using Adobe Photoshop.
Widefield fluorescence microscopy
Images were captured using a Nikon Ti-E upright light microscope equipped with a motorised stage. All widefield images were captured using a 20X objective.
Data was analysed using custom code written in Python. Error bars are described in captions, except where graphs are box and whisker plots, where the line is the median, box indicates lower quartile to upper quartile and error bars show the rest of the distribution excluding outliers. The illustrative data provided in the text and figures include means and standard deviations. An alpha of p < 0.05 was adopted to establish statistical significance, providing a 5% chance of a false positive error. Where suitable assumptions were met, inferential frequentist statistics were used to determine whether statistically significant differences existed between groups. All tests were two tailed tests for statistical significance. For related samples t-tests or independent T-tests alpha values for significance were corrected via the Bonferroni method. For one-way analysis of variance (ANOVA) and the multivariate 2 x 3 repeated measures ANOVA when a significant interaction or main effect was found, this was followed up with pairwise Games-Howell post hoc tests with Tukey correction for multiple comparisons. This was adopted as = there were always differences between sample sizes and variance due to inclusion of in-silico controls. As seen in Figure S5D, four DCT basis functions were used to summarise spatial modes of spontaneous activity. Pairwise Pearson’s correlations were used to test the relationship between the ensuing scores—along with time (s) and max and mean firing rates (Hz)—with average rally length.