Nutrient concentrations in food display universal behaviour

Giulia Menichetti and Albert-László Barabási

Nature Food volume 3, pages 375–382 (2022)

Extensive programmes around the world endeavour to measure and catalogue the composition of food. Here we analyse the nutrient content of the full US food supply and show that the concentration of each nutrient follows a universal single-parameter scaling law that accurately captures the eight orders of magnitude in nutrient content variability. We show that the universality is rooted in the biochemical constraints obeyed by the metabolic pathways responsible for nutrient modulation, allowing us to confirm the empirically observed scaling law and to predict its variability in agreement with the data. We propose that the natural nutrient variability in food can be quantitatively formalized. This provides a mathematical rationale for imputing missing values in food composition databases and paves the way towards a quantitative understanding of the impact of food processing on nutrient balance and health effects.

Dynamics of ranking

Gerardo Iñiguez, Carlos Pineda, Carlos Gershenson, & Albert-László Barabási

Nature Communications volume 13, Article number: 1646 (2022)

Virtually anything can be and is ranked: people, institutions, countries, words, genes. Rankings reduce complex systems to ordered lists, reflecting the ability of their elements to perform relevant functions, and are being used from socioeconomic policy to knowledge extraction. A century of research has found regularities when temporal rank data is aggregated. Far less is known, however, about how rankings change in time. Here we explore the dynamics of 30 rankings in natural, social, economic, and infrastructural systems, comprising millions of elements and timescales from minutes to centuries. We find that the flux of new elements determines the stability of a ranking: for high flux only the top of the list is stable, otherwise top and bottom are equally stable. We show that two basic mechanisms, displacement and replacement of elements, capture empirical ranking dynamics. The model uncovers two regimes of behavior: fast and large rank changes, or slow diffusion. Our results indicate that the balance between robustness and adaptability in ranked systems might be governed by simple random processes irrespective of system details.
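
As a rough illustration of the "flux of new elements" mentioned above, the sketch below measures how many entries of a top-N list are newcomers from one snapshot to the next. The function name, the snapshot format and this particular definition of flux are assumptions made for illustration; the abstract does not spell out the measure used in the paper.

```python
def mean_flux(snapshots, top_n):
    """Average fraction of top_n entries that were absent from the previous top_n.

    snapshots: list of ranked lists (best first), one per time step.
    This definition of flux is an illustrative assumption.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least two snapshots to measure flux")
    fractions = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        prev_top = set(prev[:top_n])
        curr_top = curr[:top_n]
        # Newcomers are elements in the current top list that were not there before.
        newcomers = [element for element in curr_top if element not in prev_top]
        fractions.append(len(newcomers) / len(curr_top))
    return sum(fractions) / len(fractions)


# Hypothetical three-step ranking history.
history = [["a", "b", "c"], ["a", "c", "d"], ["d", "e", "a"]]
flux = mean_flux(history, top_n=2)
```

A value near 0 would correspond to a closed, stable list, and a value near 1 to a list whose membership is almost fully replaced at every step.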

Recovery coupling in multilayer networks

Michael M. Danziger & Albert-László Barabási

Nature Communications volume 13, Article number: 955 (2022)

The increased complexity of infrastructure systems has resulted in critical interdependencies between multiple networks—communication systems require electricity, while the normal functioning of the power grid relies on communication systems. These interdependencies have inspired an extensive literature on coupled multilayer networks, assuming a hard interdependence, where a component failure in one network causes failures in the other network, resulting in a cascade of failures across multiple systems. While empirical evidence of such hard failures is limited, the repair and recovery of a network requires resources typically supplied by other networks, resulting in documented interdependencies induced by the recovery process. In this work, we explore recovery coupling, capturing the dependence of the recovery of one system on the instantaneous functional state of another system. If the support networks are not functional, recovery will be slowed. Here we collected data on the recovery time of millions of power grid failures, finding evidence of universal nonlinear behavior in recovery following large perturbations. We develop a theoretical framework to address recovery coupling, predicting quantitative signatures different from the multilayer cascading failures. We then rely on controlled natural experiments to separate the role of recovery coupling from other effects like resource limitations, offering direct evidence of how recovery coupling affects a system’s functionality.

Quantifying NFT‑driven networks in crypto art

Kishore Vasan, Milán Janosov & Albert‑László Barabási

Scientific Reports volume 12, Article number: 2769 (2022)

The evolution of the art ecosystem is driven by largely invisible networks, defined by undocumented interactions between artists, institutions, collectors and curators. The emergence of cryptoart, and the NFT-based digital marketplace around it, offers unprecedented opportunities to examine the mechanisms that shape the evolution of networks that define artistic practice. Here we mapped the Foundation platform, identifying over 48,000 artworks through the associated NFTs listed by over 15,000 artists, allowing us to characterize the patterns that govern the networks that shape artistic success. We find that NFT adoption by both artists and collectors has undergone major changes, starting with a rapid growth that peaked in March 2021 and the emergence of a new equilibrium in June. Despite significant changes in activity, the average price of the sold art remained largely unchanged, with the price of an artist’s work fluctuating in a range that determines his or her reputation. The artist invitation network offers evidence of rich and poor artist clusters, driven by homophily, indicating that the newly invited artists develop similar engagement and sales patterns as the artist who invited them. We find that successful artists receive disproportional, repeated investment from a small group of collectors, underscoring the importance of artist–collector ties in the digital marketplace. These reproducible patterns allow us to characterize the features, mechanisms, and the networks enabling the success of individual artists, a quantification necessary to better understand the emerging NFT ecosystem.

Network medicine framework for identifying drug-repurposing opportunities for COVID-19

Deisy Morselli Gysi, Ítalo do Valle, Marinka Zitnik, Asher Ameli, Xiao Gan, Onur Varol, Susan Dina Ghiassian, J. J. Patten, Robert A. Davey, Joseph Loscalzo, and Albert-László Barabási

PNAS May 11, 2021 118 (19) e2025581118

The COVID-19 pandemic has highlighted the need to quickly and reliably prioritize clinically approved compounds for their potential effectiveness for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs experimentally screened in VeroE6 cells, as well as the list of drugs in clinical trials that capture the medical community’s assessment of drugs with potential COVID-19 efficacy. We find that no single predictive algorithm offers consistently reliable outcomes across all datasets and metrics. This outcome prompted us to develop a multimodal technology that fuses the predictions of all algorithms, finding that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We screened in human cells the top-ranked drugs, obtaining a 62% success rate, in contrast to the 0.8% hit rate of nonguided screenings. Of the six drugs that reduced viral infection, four could be directly repurposed to treat COVID-19, proposing novel treatments for COVID-19. We also found that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these network drugs rely on network-based mechanisms that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.

Network medicine framework shows that proximity of polyphenol targets and disease proteins predicts therapeutic effects of polyphenols

Italo F. do Valle, Harvey G. Roweth, Michael W. Malloy, Sofia Moco, Denis Barron, Elisabeth Battinelli, Joseph Loscalzo & Albert-László Barabási

Nature Food volume 2, pages 143–155 (2021)

Polyphenols, natural products present in plant-based foods, play a protective role against several complex diseases through their antioxidant activity and by diverse molecular mechanisms. Here we develop a network medicine framework to uncover mechanisms for the effects of polyphenols on health by considering the molecular interactions between polyphenol protein targets and proteins associated with diseases. We find that the protein targets of polyphenols cluster in specific neighbourhoods of the human interactome, whose network proximity to disease proteins is predictive of the molecule’s known therapeutic effects. The methodology recovers known associations, such as the effect of epigallocatechin-3-O-gallate on type 2 diabetes, and predicts that rosmarinic acid has a direct impact on platelet function, representing a novel mechanism through which it could affect cardiovascular health. We experimentally confirm that rosmarinic acid inhibits platelet aggregation and α-granule secretion through inhibition of protein tyrosine phosphorylation, offering direct support for the predicted molecular mechanism. Our framework represents a starting point for mechanistic interpretation of the health effects underlying food-related compounds, allowing us to integrate into a predictive framework knowledge on food metabolism, bioavailability and drug interaction.
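
Network proximity between a compound's protein targets and a set of disease proteins can be illustrated with a small shortest-path computation. The sketch below uses the mean distance from each target to its nearest disease protein; this particular measure, the function names and the toy interactome are assumptions for illustration, since the abstract does not define the proximity measure.

```python
from collections import deque


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_distance(adj, targets, disease_proteins):
    """Mean, over targets, of the hop distance to the nearest disease protein."""
    nearest = []
    for target in targets:
        dist = shortest_path_lengths(adj, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if reachable:  # targets that cannot reach any disease protein are skipped
            nearest.append(min(reachable))
    if not nearest:
        raise ValueError("no target can reach any disease protein")
    return sum(nearest) / len(nearest)


# Hypothetical toy interactome: t1 - x - d1 - t2
toy = {"t1": ["x"], "x": ["t1", "d1"], "d1": ["x", "t2"], "t2": ["d1"]}
proximity = closest_distance(toy, targets=["t1", "t2"], disease_proteins=["d1"])
```

In practice such a raw distance is usually compared against randomized target sets before it is interpreted, a step omitted here for brevity.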

A wealth of discovery built on the Human Genome Project — by the numbers

Alexander J. Gates, Deisy Morselli Gysi, Manolis Kellis & Albert-László Barabási

Nature 590, 212-215 (2021)

The 20th anniversary of the publication of the first draft of the human genome offers an opportunity to track how the project has empowered research into the genetic roots of human disease, changed drug discovery and helped to revise the idea of the gene itself.

Here we distill these impacts and trends. We combined several data sets to quantify the different types of genetic element that have been discovered and that generated publications, and how the pattern of discovery and publishing has changed over the years. Our analysis linked together data including RNA transcripts; around 1 million single nucleotide polymorphisms (SNPs); human diseases with documented genetic roots; approved and experimental pharmaceuticals; and scientific publications between 1900 and 2017.

Social network structure and composition in former NFL football players

Amar Dhand, Liam McCafferty, Rachel Grashow, Ian M. Corbin, Sarah Cohan, Alicia J. Whittington, Ann Connor, Aaron Baggish, Mark Weisskopf, Ross Zafonte, Alvaro Pascual-Leone & Albert-László Barabási

Scientific Reports volume 11, Article number: 1630 (2021)

Social networks have broad effects on health and quality of life. Biopsychosocial factors may also modify the effects of brain trauma on clinical and pathological outcomes. However, social network characterization is missing in studies of contact sports athletes. Here, we characterized the personal social networks of former National Football League players compared to non-football US males. In 303 former football players and 269 US males, we found that network structure (e.g., network size) did not differ, but network composition (e.g., proportion of family versus friends) did differ. Football players had more men than women, and more friends than family in their networks compared to US males. Black players had more racially diverse networks than White players and US males. These results are unexpected because brain trauma and chronic illnesses typically cause diminished social relationships. We anticipate our study will inform more multi-dimensional study of, and treatment options for, contact sports athletes. For example, the strong allegiances of former athletes may be harnessed in the form of social network interventions after brain trauma. Because preserving health of contact sports athletes is a major goal, the study of social networks is critical to the design of future research and treatment trials.

Uncovering the genetic blueprint of the C. elegans nervous system

István A. Kovács, Dániel L. Barabási, and Albert-László Barabási

PNAS December 29, 2020 117 (52) 33570-33577

A fundamental question of neuroscience is how the brain wires itself. Here, we propose a modeling framework that explains how cellular connectivity emerges from neuronal identity, allowing us to offer experimentally falsifiable predictions on the genetic encoding of the connectome. The rapid advances in brain science require quantitative frameworks to integrate genetic and connectome information. The proposed model responds to this need, helping us unveil the genetically driven mechanisms that govern the formation of individual links in the brain.

A systematic comprehensive longitudinal evaluation of dietary factors associated with acute myocardial infarction and fatal coronary heart disease

Soodabeh Milanlouei, Giulia Menichetti, Yanping Li, Joseph Loscalzo, Walter C. Willett & Albert-László Barabási

Nature Communications volume 11, Article number: 6074 (2020)

Environmental factors, and in particular diet, are known to play a key role in the development of Coronary Heart Disease. Many of these factors were unveiled by detailed nutritional epidemiology studies, focusing on the role of a single nutrient or food at a time. Here, we apply an Environment-Wide Association Study approach to Nurses’ Health Study data to explore comprehensively and agnostically the association of 257 nutrients and 117 foods with coronary heart disease risk (acute myocardial infarction and fatal coronary heart disease). After accounting for multiple testing, we identify 16 food items and 37 nutrients that show statistically significant association – while adjusting for potential confounding and control variables such as physical activity, smoking, calorie intake, and medication use – among which 38 associations were validated in Nurses’ Health Study II. Our implementation of Environment-Wide Association Study successfully reproduces prior knowledge of diet-coronary heart disease associations in the epidemiological literature, and helps us detect new associations that were only marginally studied, opening potential avenues for further extensive experimental validation. We also show that Environment-Wide Association Study allows us to identify a bipartite food-nutrient network, highlighting which foods drive the associations of specific nutrients with coronary heart disease risk.

Isotopy and energy of physical networks

Yanchen Liu, Nima Dehmamy & Albert-László Barabási

Nature Physics (2020)

While the structural characteristics of a network are uniquely determined by its adjacency matrix, in physical networks, such as the brain or the vascular system, the network’s three-dimensional layout also affects the system’s structure and function. We lack, however, the tools to distinguish physical networks with identical wiring but different geometrical layouts. To address this need, here we introduce the concept of network isotopy, representing different network layouts that can be transformed into one another without link crossings, and show that a single quantity, the graph linking number, captures the entangledness of a layout, defining distinct isotopy classes. We find that a network’s elastic energy depends linearly on the graph linking number, indicating that each local tangle offers an independent contribution to the total energy. This finding allows us to formulate a statistical model for the formation of tangles in physical networks. We apply the developed framework to a diverse set of real physical networks, finding that the mouse connectome is more entangled than expected based on optimal wiring.

Exploring food contents in scientific literature with FoodMine

Forrest Hooton, Giulia Menichetti & Albert‐László Barabási

Scientific Reports volume 10, Article number: 16191 (2020)

Thanks to the many chemical and nutritional components it carries, diet critically affects human health. However, the currently available comprehensive databases on food composition cover only a tiny fraction of the total number of chemicals present in our food, focusing on the nutritional components essential for our health. Indeed, thousands of other molecules, many of which have well documented health implications, remain untracked. To explore the body of knowledge available on food composition, we built FoodMine, an algorithm that uses natural language processing to identify papers from PubMed that potentially report on the chemical composition of garlic and cocoa. After extracting from each paper information on the reported quantities of chemicals, we find that the scientific literature carries extensive information on the detailed chemical components of food that is currently not integrated in databases. Finally, we use unsupervised machine learning to create chemical embeddings, finding that the chemicals identified by FoodMine tend to have direct health relevance, reflecting the scientific community’s focus on health-related chemicals in our food.

Science, advocacy, and quackery in nutritional books: an analysis of conflicting advice and purported claims of nutritional best-sellers

Rebecca M. Marton, Xindi Wang, Albert-László Barabási & John P. A. Ioannidis

Palgrave Communications volume 6, Article number: 43 (2020)

Nutritional decisions may be important for health, and yet identifying trustworthy sources of advice can be difficult. Many people turn to books for nutritional advice, making the contents of these books and the expertise of their authors relevant to public health. Here, the top 100 best-selling books were identified and assessed for both the claims they make in their summaries and the credentials of the authors. Weight loss was a common theme in the summaries of nutritional best-selling books. In addition to weight loss, 31 of the books promised to cure or prevent a host of diseases, including diabetes, heart disease, cancer, and dementia; however, the nutritional advice given to achieve these outcomes varied widely in terms of which types of foods should be consumed or avoided and this information was often contradictory between books. Recommendations regarding the consumption of carbohydrates, dairy, proteins, and fat in particular differed greatly between books. To determine the qualifications of each author in making nutritional claims, the highest earned degree and listed occupations of each author were researched and analyzed. Out of 83 unique authors, 33 had an M.D. or Ph.D. degree. Twenty-eight of the authors were physicians, three were dietitians, and other authors held a wide range of jobs, including personal trainers, bloggers, and actors. Of 20 authors who had or claimed university affiliations, seven had a current university appointment that could be verified online in university directories. This study illuminates the range of the incongruous information being dispersed to the public and emphasizes the need for future efforts to improve the dissemination of sound nutritional advice.

Historical comparison of gender inequality in scientific careers across countries and disciplines

Junming Huang, Alexander J. Gates, Roberta Sinatra, and Albert-László Barabási

PNAS March 3, 2020 117 (9) 4609-4616

There is extensive, yet fragmented, evidence of gender differences in academia suggesting that women are underrepresented in most scientific disciplines and publish fewer articles throughout a career, and their work acquires fewer citations. Here, we offer a comprehensive picture of longitudinal gender differences in performance through a bibliometric analysis of academic publishing careers by reconstructing the complete publication history of over 1.5 million gender-identified authors whose publishing career ended between 1955 and 2010, covering 83 countries and 13 disciplines. We find that, paradoxically, the increase of participation of women in science over the past 60 years was accompanied by an increase of gender differences in both productivity and impact. Most surprisingly, though, we uncover two gender invariants, finding that men and women publish at a comparable annual rate and have equivalent career-wise impact for the same size body of work. Finally, we demonstrate that differences in publishing career lengths and dropout rates explain a large portion of the reported career-wise differences in productivity and impact, although productivity differences still remain. This comprehensive picture of gender inequality in academia can help rephrase the conversation around the sustainability of women’s careers in academia, with important consequences for institutions and policy makers.

The exposome and health: Where chemistry meets biology

Roel Vermeulen, Emma L. Schymanski, Albert-László Barabási, Gary W. Miller

Science 24 Jan 2020: 367, 6476, 392-396

Despite extensive evidence showing that exposure to specific chemicals can lead to disease, current research approaches and regulatory policies fail to address the chemical complexity of our world. To safeguard current and future generations from the increasing number of chemicals polluting our environment, a systematic and agnostic approach is needed. The “exposome” concept strives to capture the diversity and range of exposures to synthetic chemicals, dietary constituents, psychosocial stressors, and physical factors, as well as their corresponding biological responses. Technological advances such as high-resolution mass spectrometry and network science have allowed us to take the first steps toward a comprehensive assessment of the exposome. Given the increased recognition of the dominant role that nongenetic factors play in disease, an effort to characterize the exposome at a scale comparable to that of the human genome is warranted.

Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

Nima Dehmamy, Albert-László Barabási, Rose Yu

NeurIPS 32 2019

To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding paths of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We analyze theoretically the expressiveness of GCNs, concluding that a modular GCN design, using different propagation rules with residual connections, could significantly improve the performance of GCN. We demonstrate that such modular designs are capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that depth is much more influential than width, with deeper GCNs being more capable of learning higher order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs.

The unmapped chemical complexity of our diet

Albert-László Barabási, Giulia Menichetti & Joseph Loscalzo

Nature Food 1, 33-37 (2019)

Our understanding of how diet affects health is limited to 150 key nutritional components that are tracked and catalogued by the United States Department of Agriculture and other national databases. Although this knowledge has been transformative for health sciences, helping unveil the role of calories, sugar, fat, vitamins and other nutritional factors in the emergence of common diseases, these nutritional components represent only a small fraction of the more than 26,000 distinct, definable biochemicals present in our food—many of which have documented effects on health but remain unquantified in any systematic fashion across different individual foods. Using new advances such as machine learning, a high-resolution library of these biochemicals could enable the systematic study of the full biochemical spectrum of our diets, opening new avenues for understanding the composition of what we eat, and how it affects health and disease.

A Genetic Model of the Connectome

Dániel L. Barabási, Albert-László Barabási

Neuron 105, 1-11 2019

The connectomes of organisms of the same species show remarkable architectural and often local wiring similarity, raising the question: where and how is neuronal connectivity encoded? Here, we start from the hypothesis that the genetic identity of neurons guides synapse and gap-junction formation and show that such genetically driven wiring predicts the existence of specific biclique motifs in the connectome. We identify a family of large, statistically significant biclique subgraphs in the connectomes of three species and show that within many of the observed bicliques the neurons share statistically significant expression patterns and morphological characteristics, supporting our expectation of common genetic factors that drive the synapse formation within these subgraphs. The proposed connectome model offers a self-consistent framework to link the genetics of an organism to the reproducible architecture of its connectome, offering experimentally falsifiable predictions on the genetic factors that drive the formation of individual neuronal circuits.

Synthetic ablations in the C. elegans nervous system

Emma K. Towlson and Albert-László Barabási

Network Neuroscience 2020, pp. 1–17

Synthetic lethality, the finding that the simultaneous knockout of two or more individually nonessential genes leads to cell or organism death, has offered a systematic framework to explore cellular function, and also offered therapeutic applications. Yet the concept lacks its parallel in neuroscience—a systematic knowledge base on the role of double or higher order ablations in the functioning of a neural system. Here, we use the framework of network control to systematically predict the effects of ablating neuron pairs and triplets on the gentle touch response. We find that surprisingly small sets of 58 pairs and 46 triplets can reduce muscle controllability in this context, and that these sets are localized in the nervous system in distinct groups. Further, they lead to highly specific experimentally testable predictions about mechanisms of loss of control, and which muscle cells are expected to experience this loss.

Nature’s reach: narrow work has broad impact

Alexander J. Gates, Qing Ke, Onur Varol & Albert-László Barabási

Nature 575, 32-34 (2019)

How knowledge informs and alters disciplines is itself an enlightening and vibrant field. This type of meta research into new findings, insights, conceptual frameworks and techniques is important, among other things, for policymakers who fund research in the hope of tackling society’s most pressing challenges, which inevitably span disciplines.

Since its founding in 1869, Nature has offered a venue for publishing major advances from many fields. To mark its anniversary, we track here how papers cite and are cited across disciplines, using data on tens of millions of scientific articles indexed in Clarivate Analytics’ Web of Science (WoS), a bibliometric database that encompasses many thousands of research journals starting from 1900. We pay particular attention to articles that appeared in Nature. In our view, this snapshot, for all its idiosyncrasies, reveals how scientific work is ever more becoming a mixture of disciplines.

Success in books: predicting book sales before publication

Xindi Wang, Burcu Yucesoy, Onur Varol, Tina Eliassi-Rad, Albert-László Barabási

EPJ Data Science 8: 31 (2019)

Reading remains a preferred leisure activity fueling an exceptionally competitive publishing market: among more than three million books published each year, only a tiny fraction are read widely. It is largely unpredictable, however, which book that will be, and how many copies it will sell. Here we aim to unveil the features that affect the success of books by predicting a book’s sales prior to its publication. We do so by employing the Learning to Place machine learning approach, which can predict sales for both fiction and nonfiction books as well as explain the predictions by comparing and contrasting each book with similar ones. We analyze features contributing to the success of a book by feature importance analysis, finding that a strong driving factor of book sales across all genres is the publishing house. We also uncover differences between genres: for thrillers and mystery, the publishing history of an author (as measured by previous book sales) is highly important, while in literary fiction and religion, the author’s visibility plays a more central role. These observations provide insights into the driving forces behind success within the current publishing industry, as well as how individuals choose what books to read.

Network-based prediction of protein interactions

István A. Kovács, Katja Luck, Kerstin Spirohn, Yang Wang, Carl Pollis, Sadie Schlabach, Wenting Bian, Dae-Kyum Kim, Nishka Kishore, Tong Hao, Michael A. Calderwood, Marc Vidal & Albert-László Barabási

Nature Communications 10, Article number: 1240 (2019)

Despite exceptional experimental efforts to map out the human interactome, the continued data incompleteness limits our ability to understand the molecular roots of human disease. Computational tools offer a promising alternative, helping identify biologically significant, yet unmapped protein-protein interactions (PPIs). While link prediction methods connect proteins on the basis of biological or network-based similarity, interacting proteins are not necessarily similar and similar proteins do not necessarily interact. Here, we offer structural and evolutionary evidence that proteins interact not if they are similar to each other, but if one of them is similar to the other’s partners. This approach, that mathematically relies on network paths of length three (L3), significantly outperforms all existing link prediction methods. Given its high accuracy, we show that L3 can offer mechanistic insights into disease mechanisms and can complement future experimental efforts to complete the human interactome.
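
The principle of scoring candidate interactions by paths of length three can be sketched as follows. The abstract only says the method relies on such paths; the degree normalization by the two intermediate nodes, the function name and the toy network below are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def l3_scores(edges):
    """Score unlinked node pairs by degree-normalized paths of length three."""
    adj = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        adj[a].add(b)
        adj[b].add(a)
    degree = {node: len(neighbours) for node, neighbours in adj.items()}

    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y in adj[x]:
            continue  # already linked; only candidate (missing) links are scored
        total = 0.0
        for u in adj[x]:
            for v in adj[y]:
                if v in adj[u]:  # a path x - u - v - y exists
                    total += 1.0 / sqrt(degree[u] * degree[v])
        if total > 0:
            scores[(x, y)] = total
    return scores


# Hypothetical chain a - b - c - d: the only length-three path joins a and d.
candidates = l3_scores([("a", "b"), ("b", "c"), ("c", "d")])
```

Note that the pair (a, c), which shares the neighbour b, receives no score here: the sketch rewards a node that resembles the partners of the other node rather than the other node itself, matching the observation in the abstract.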

Network-based prediction of drug combinations

Feixiong Cheng, István A. Kovács & Albert-László Barabási

Nature Communications 10, Article number: 1197 (2019)

Drug combinations, offering increased therapeutic efficacy and reduced toxicity, play an important role in treating multiple complex diseases. Yet, our ability to identify and validate effective combinations is limited by a combinatorial explosion, driven by both the large number of drug pairs and the number of dosage combinations. Here we propose a network-based methodology to identify clinically efficacious drug combinations for specific diseases. By quantifying the network-based relationship between drug targets and disease proteins in the human protein–protein interactome, we show the existence of six distinct classes of drug–drug–disease combinations. Relying on approved drug combinations for hypertension and cancer, we find that only one of the six classes correlates with therapeutic effects: if the targets of the drugs both hit the disease module, but target separate neighborhoods. This finding allows us to identify and validate antihypertensive combinations, offering a generic, powerful network methodology to identify efficacious combination therapies in drug development.

Taking Census of Physics

Federico Battiston, Federico Musciotto, Dashun Wang, Albert-László Barabási, Michael Szell, and Roberta Sinatra

Nature Reviews Physics 1, 89-97 (2019)

Over the past decades, the diversity of areas explored by physicists has exploded, encompassing new topics from biophysics and chemical physics to network science. However, it is unclear how these new subfields emerged from the traditional subject areas and how physicists explore them. To map out the evolution of physics subfields, here, we take an intellectual census of physics by studying physicists’ careers. We use a large-scale publication data set, identify the subfields of 135,877 physicists and quantify their heterogeneous birth, growth and migration patterns among research areas. We find that the majority of physicists began their careers in only three subfields, branching out to other areas at later career stages, with different rates and transition times. Furthermore, we analyse the productivity, impact and team sizes across different subfields, finding drastic changes attributable to the recent rise in large-scale collaborations. This detailed, longitudinal census of physics can inform resource allocation policies and provide students, editors and scientists with a broader view of the field’s internal dynamics.

The Chaperone Effect in Scientific Publishing

Vedran Sekara, Pierre Deville, Sebastian E. Ahnert, Albert-László Barabási, Roberta Sinatra, and Sune Lehmann

PNAS 115:50, 12603-12607 (2018)

Experience plays a critical role in crafting high-impact scientific work. This is particularly evident in top multidisciplinary journals, where a scientist is unlikely to appear as senior author if he or she has not previously published within the same journal. Here, we develop a quantitative understanding of author order by quantifying this “chaperone effect,” capturing how scientists transition into senior status within a particular publication venue. We illustrate that the chaperone effect has a different magnitude for journals in different branches of science, being more pronounced in medical and biological sciences and weaker in natural sciences. Finally, we show that in the case of high-impact venues, the chaperone effect has significant implications, specifically resulting in a higher average impact relative to papers authored by new principal investigators (PIs). Our findings shed light on the role played by experience in publishing within specific scientific journals, on the paths toward acquiring the necessary experience and expertise, and on the skills required to publish in prestigious venues.

The Universal Decay of Collective Memory and Attention

Cristian Candia, C. Jara-Figueroa, Carlos Rodriguez-Sickert, Albert-László Barabási, and César A. Hidalgo

Nature Human Behaviour 3, 82–91 (2019)

Collective memory and attention are sustained by two channels: oral communication (communicative memory) and the physical recording of information (cultural memory). Here, we use data on the citation of academic articles and patents, and on the online attention received by songs, movies and biographies, to describe the temporal decay of the attention received by cultural products. We show that, once we isolate the temporal dimension of the decay, the attention received by cultural products decays following a universal biexponential function. We explain this universality by proposing a mathematical model based on communicative and cultural memory, which fits the data better than previously proposed log-normal and exponential models. Our results reveal that biographies remain in our communicative memory the longest (20–30 years) and music the shortest (about 5.6 years). These findings show that the average attention received by cultural products decays following a universal biexponential function.
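To illustrate what a biexponential decay of attention can look like, the sketch below assumes a two-channel form: communicative attention decays at rate p + r, part of it flows at rate r into cultural memory, and cultural memory decays at rate q. The symbols N, p, r and q, their default values, and the function name are assumptions made for this example; the abstract itself does not state the formula.

```python
import math

def biexponential_attention(t, N=1.0, p=0.2, r=0.05, q=0.01):
    """Total attention S(t) = u(t) + v(t) for an assumed two-channel model.

    Assumed dynamics (illustrative, not taken from the abstract):
        du/dt = -(p + r) * u,   u(0) = N   (communicative memory)
        dv/dt = r * u - q * v,  v(0) = 0   (cultural memory)
    """
    fast = p + r
    if math.isclose(fast, q):
        raise ValueError("this closed form requires p + r != q")
    # Fast-decaying communicative channel.
    u = N * math.exp(-fast * t)
    # Slow-decaying cultural channel, fed by the communicative one.
    v = N * r / (fast - q) * (math.exp(-q * t) - math.exp(-fast * t))
    return u + v
```

Under these assumptions the curve starts at N, drops quickly while the communicative channel dominates, and then follows the slower exponential with rate q.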

A Structural Transition in Physical Networks

Nima Dehmamy, Soodabeh Milanlouei & Albert-László Barabási

Nature 563, pages 676–680 (2018)

In many physical networks, including neurons in the brain, three-dimensional integrated circuits and underground hyphal networks, the nodes and links are physical objects that cannot intersect or overlap with each other. To take this into account, non-crossing conditions can be imposed to constrain the geometry of networks, which consequently affects how they form, evolve and function. However, these constraints are not included in the theoretical frameworks that are currently used to characterize real networks. Most tools for laying out networks are variants of the force-directed layout algorithm, which assumes dimensionless nodes and links, and are therefore unable to reveal the geometry of densely packed physical networks. Here we develop a modelling framework that accounts for the physical sizes of nodes and links, allowing us to explore how non-crossing conditions affect the geometry of a network. For small link thicknesses, we observe a weakly interacting regime in which link crossings are avoided via local link rearrangements, without altering the overall geometry of the layout compared to the force-directed layout. Once the link thickness exceeds a threshold, a strongly interacting regime emerges in which multiple geometric quantities, such as the total link length and the link curvature, scale with the link thickness. We show that the crossover between the two regimes is driven by the non-crossing condition, which allows us to derive the transition point analytically and show that networks with large numbers of nodes will ultimately exist in the strongly interacting regime. We also find that networks in the weakly interacting regime display a solid-like response to stress, whereas in the strongly interacting regime they behave in a gel-like fashion. Networks in the weakly interacting regime are amenable to 3D printing and so can be used to visualize network geometry, and the strongly interacting regime provides insights into the scaling of the sizes of densely packed mammalian brains.

Quantifying Reputation and Success in Art

Samuel P. Fraiberger, Roberta Sinatra, Magnus Resch, Christoph Riedl, Albert-László Barabási

Science 08 Nov 2018: eaau7224 DOI: 10.1126/science.aau7224

In areas of human activity where performance is difficult to quantify in an objective fashion, reputation and networks of influence play a key role in determining access to resources and rewards. To understand the role of these factors, we reconstructed the exhibition history of half a million artists, mapping out the coexhibition network that captures the movement of art between institutions. Centrality within this network captured institutional prestige, allowing us to explore the career trajectory of individual artists in terms of access to coveted institutions. Early access to prestigious central institutions offered life-long access to high-prestige venues and reduced dropout rate. By contrast, starting at the network periphery resulted in a high dropout rate, limiting access to central institutions. A Markov model predicts the career trajectory of individual artists and documents the strong path and history dependence of valuation in art.

Functional Structures for US state governments

Stephen Kosack, Michele Coscia, Evann Smith, Kim Albrecht, Albert-László Barabási, and Ricardo Hausmann

Proceedings of the National Academy of Sciences Oct 2018, 201803228; DOI: 10.1073/pnas.1803228115

Governments in modern societies undertake an array of complex functions that shape politics and economics, individual and group behavior, and the natural, social, and built environment. How are governments structured to execute these diverse responsibilities? How do those structures vary, and what explains the differences? To examine these longstanding questions, we develop a technique for mapping the Internet “footprint” of government with network science methods. We use this approach to describe and analyze the diversity in functional scale and structure among the 50 US state governments reflected in the webpages and links they have created online: 32.5 million webpages and 110 million hyperlinks among 47,631 agencies. We first verify that this extensive online footprint systematically reflects known characteristics: 50 hierarchically organized networks of state agencies that scale with population and are specialized around easily identifiable functions in accordance with legal mandates. We also find that the footprint reflects extensive diversity among these state functional hierarchies. We hypothesize that this variation should reflect, among other factors, state income, economic structure, ideology, and location. We find that government structures are most strongly associated with state economic structures, with location and income playing more limited roles. Voters’ recent ideological preferences about the proper roles and extent of government are not significantly associated with the scale and structure of their state governments as reflected online. We conclude that the online footprint of governments offers a broad and comprehensive window on how they are structured that can help deepen understanding of those structures.

Caenorhabditis elegans and the network control framework—FAQs

Emma K. Towlson, Petra E. Vértes, Gang Yan, Yee Lian Chew, Denise S. Walker, William R. Schafer, and Albert-László Barabási

Phil. Trans. R. Soc. B 373: 20170372

Control is essential to the functioning of any neural system. Indeed, under healthy conditions the brain must be able to continuously maintain a tight functional control between the system’s inputs and outputs. One may therefore hypothesize that the brain’s wiring is predetermined by the need to maintain control across multiple scales, maintaining the stability of key internal variables, and producing behaviour in response to environmental cues. Recent advances in network control have offered a powerful mathematical framework to explore the structure–function relationship in complex biological, social and technological networks, and are beginning to yield important and precise insights on neuronal systems. The network control paradigm promises a predictive, quantitative framework to unite the distinct datasets necessary to fully describe a nervous system, and provide mechanistic explanations for the observed structure and function relationships. Here, we provide a thorough review of the network control framework as applied to Caenorhabditis elegans (Yan et al. 2017 Nature 550, 519–523. (doi:10.1038/nature24056)), in the style of Frequently Asked Questions. We present the theoretical, computational and experimental aspects of network control, and discuss its current capabilities and limitations, together with the next likely advances and improvements. We further present the Python code to enable exploration of control principles in a manner specific to this prototypical organism. This article is part of a discussion meeting issue ‘Connectome to behaviour: modelling C. elegans at cellular resolution’.

Network-based approach to prediction and population-based validation of in silico drug repurposing

Feixiong Cheng, Rishi J. Desai, Diane E. Handy, Ruisheng Wang, Sebastian Schneeweiss, Albert-László Barabási & Joseph Loscalzo

Nature Communications volume 9, Article number: 2691 (2018)

Here we identify hundreds of new drug-disease associations for over 900 FDA-approved drugs by quantifying the network proximity of disease genes and drug targets in the human (protein–protein) interactome. We select four network-predicted associations to test their causal relationship using large healthcare databases with over 220 million patients and state-of-the-art pharmacoepidemiologic analyses. Using propensity score matching, two of four network-based predictions are validated in patient-level data: carbamazepine is associated with an increased risk of coronary artery disease (CAD) [hazard ratio (HR) 1.56, 95% confidence interval (CI) 1.12–2.18], and hydroxychloroquine is associated with a decreased risk of CAD (HR 0.76, 95% CI 0.59–0.97). In vitro experiments show that hydroxychloroquine attenuates pro-inflammatory cytokine-mediated activation in human aortic endothelial cells, supporting mechanistically its potential beneficial effect in CAD. In summary, we demonstrate that a unique integration of protein-protein interaction network proximity and large-scale patient-level longitudinal data complemented by mechanistic in vitro studies can facilitate drug repurposing.
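Network proximity can be defined in several ways. One simple choice, assumed here purely for illustration, is a "closest" distance: the average, over drug targets, of the shortest-path length to the nearest disease protein. The toy graph, the protein labels and the function names below are hypothetical, and the abstract does not specify the exact measure or the statistical normalization used in the study.

```python
from collections import deque

def shortest_path_lengths(graph, source):
    """Breadth-first search distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def closest_proximity(graph, drug_targets, disease_proteins):
    """Mean over drug targets of the distance to the nearest disease protein."""
    total = 0
    for target in drug_targets:
        dist = shortest_path_lengths(graph, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if not reachable:
            raise ValueError(f"{target!r} cannot reach any disease protein")
        total += min(reachable)
    return total / len(drug_targets)

# Toy undirected interactome as an adjacency mapping (hypothetical proteins).
toy_interactome = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"],
}
```

For example, `closest_proximity(toy_interactome, ["A", "C"], ["D", "E"])` averages the distance from A to D (3 steps) and from C to D (1 step). A smaller value means the drug targets sit closer to the disease proteins in the network.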

Predicting Perturbation Patterns from the Topology of Biological Networks

Marc Santolini and Albert-Laszlo Barabasi

PNAS | vol. 115 | no. 27 | E6375–E6383

High-throughput technologies, offering an unprecedented wealth of quantitative data underlying the makeup of living systems, are changing biology. Notably, the systematic mapping of the relationships between biochemical entities has fueled the rapid development of network biology, offering a suitable framework to describe disease phenotypes and predict potential drug targets. However, our ability to develop accurate dynamical models remains limited, due in part to the limited knowledge of the kinetic parameters underlying these interactions. Here, we explore the degree to which we can make reasonably accurate predictions in the absence of the kinetic parameters. We find that simple dynamically agnostic models are sufficient to recover the strength and sign of the biochemical perturbation patterns observed in 87 biological models for which the underlying kinetics are known. Surprisingly, a simple distance-based model achieves 65% accuracy. We show that this predictive power is robust to topological and kinetic parameter perturbations, and we identify key network properties that can increase the recovery rate of the true perturbation patterns up to 80%. We validate our approach using experimental data on the chemotactic pathway in bacteria, finding that a network model of perturbation spreading predicts with ∼80% accuracy the directionality of gene expression and phenotype changes in knock-out and overproduction experiments. These findings show that the steady advances in mapping out the topology of biochemical interaction networks open avenues for accurate perturbation spread modeling, with direct implications for medicine and drug development.

Success In Books: A Big Data Approach to Bestsellers

Burcu Yucesoy, Xindi Wang, Junming Huang, Albert-Laszlo Barabasi

EPJ Data Science 7:7

Reading remains the preferred leisure activity for most individuals, continuing to offer a unique path to knowledge and learning. As such, books remain an important cultural product, consumed widely. Yet, while over 3 million books are published each year, very few are read widely and fewer than 500 make it to the New York Times bestseller lists. And once there, only a handful of authors can command the lists for more than a few weeks. Here we bring a big data approach to book success by investigating the properties and sales trajectories of bestsellers. We find that there are seasonal patterns to book sales with more books being sold during holidays, and even among bestsellers, fiction books sell more copies than nonfiction books. General fiction and biographies make the list more often than any other genre books, and the higher a book’s initial place in the rankings, the longer the book stays on the list as well. Looking at patterns characterizing authors, we find that fiction writers are more productive than nonfiction writers, commonly achieving bestseller status with multiple books. Additionally, there is no gender disparity among bestselling fiction authors but in nonfiction, most bestsellers are written by male authors. Finally we find that there is a universal pattern to book sales. Using this universality we introduce a statistical model to explain the time evolution of sales. This model not only reproduces the entire sales trajectory of a book but also predicts the total number of copies it will sell in its lifetime, based on its early sales numbers. The analysis of the bestseller characteristics and the discovery of the universal nature of sales patterns with its driving forces are crucial for our understanding of the book industry, and more generally, of how we as a society interact with cultural products.

Science of Science

Santo Fortunato, Carl T. Bergstrom, Katy Borner, James A. Evans, Dirk Helbing, Stasa Milojevic, Alexander M. Petersen, Filippo Radicchi, Roberta Sinatra, Brian Uzzi, Alessandro Vespignani, Luda Waltman, Dashun Wang, Albert-Laszlo Barabasi

Science 359: 6379 (2018)

The science of science (SciSci) is based on a transdisciplinary approach that uses large data sets to study the mechanisms underlying the doing of science--from the choice of a research problem to career trajectories and progress within a field. In a Review, Fortunato et al. explain that the underlying rationale is that with a deeper understanding of the precursors of impactful science, it will be possible to develop systems and policies that improve each scientist's ability to succeed and enhance the prospects of science as a whole.

The Fundamental Advantages of Temporal Networks

A. Li, S. P. Cornelius, Y.-Y. Liu, L. Wang, A.-L. Barabasi

Science 358:6366, 1042-1046 (2017).

Most networked systems of scientific interest are characterized by temporal links, meaning the network’s structure changes over time. Link temporality has been shown to hinder many dynamical processes, from information spreading to accessibility, by disrupting network paths. Considering the ubiquity of temporal networks in nature, we ask: Are there any advantages of the networks’ temporality? We use an analytical framework to show that temporal networks can, compared to their static counterparts, reach controllability faster, demand orders of magnitude less control energy, and have control trajectories that are considerably more compact than those characterizing static networks. Thus, temporality ensures a degree of flexibility that would be unattainable in static networks, enhancing our ability to control them.

Network Control Principles Predict Neuron Function in the Caenorhabditis elegans Connectome

G. Yan, P. E. Vertes, E. K. Towlson, Y. L. Chew, S. Walker, W. R. Schafer, A.-L. Barabasi

Nature 550, 519–523 (2017)

Recent studies on the controllability of complex systems offer a powerful mathematical framework to systematically explore the structure–function relationship in biological, social, and technological networks [1–3]. Despite theoretical advances, we lack direct experimental proof of the validity of these widely used control principles. Here we fill this gap by applying a control framework to the connectome of the nematode Caenorhabditis elegans [4–6], allowing us to predict the involvement of each C. elegans neuron in locomotor behaviours. We predict that control of the muscles or motor neurons requires 12 neuronal classes, which include neuronal groups previously implicated in locomotion by laser ablation [7–13], as well as one previously uncharacterized neuron, PDB. We validate this prediction experimentally, finding that the ablation of PDB leads to a significant loss of dorsoventral polarity in large body bends. Importantly, control principles also allow us to investigate the involvement of individual neurons within each neuronal class. For example, we predict that, within the class of DD motor neurons, only three (DD04, DD05, or DD06) should affect locomotion when ablated individually. This prediction is also confirmed; single cell ablations of DD04 or DD05 specifically affect posterior body movements, whereas ablations of DD02 or DD03 do not. Our predictions are robust to deletions of weak connections, missing connections, and rewired connections in the current connectome, indicating the potential applicability of this analytical framework to larger and less well-characterized connectomes.

The Elegant Law that Governs Us All

A.-L. Barabasi

Science 357:6347 (2017)

A physicist probes a phenomenon seen in cells, cities, and almost everything in between.

Academia Under Fire in Hungary

A.-L. Barabasi

Science 356: 6338 (2017)

On 10 April, Hungarian President Janos Ader signed into law an amendment to the National Higher Education Law that would outlaw the Central European University (CEU). Although portrayed by the government as a purely administrative step, the "Lex-CEU" law is a strident attempt to curtail academic freedom and limit the independence of academic institutions.

Identifying and modeling the structural discontinuities of human interactions

S. Grauwin, M. Szell, S. Sobolevsky, P. Hovel, F. Simini, M. Vanhoof, Z. Smoreda, A.-L. Barabasi & C. Ratti

Scientific Reports 7: 46677 (2017)

The idea of a hierarchical spatial organization of society lies at the core of seminal theories in human geography that have strongly influenced our understanding of social organization. Along the same line, the recent availability of large-scale human mobility and communication data has offered novel quantitative insights hinting at a strong geographical confinement of human interactions within neighboring regions, extending to local levels within countries. However, models of human interaction largely ignore this effect. Here, we analyze several country-wide networks of telephone calls - both mobile and landline - and in either case uncover a systematic decrease of communication induced by borders we identify as the missing variable in state-of-the-art models. Using this empirical evidence, we propose an alternative modeling framework that naturally stylizes the damping effect of borders. We show that this new notion substantially improves the predictive power of widely used interaction models. This increases our ability to understand, model and predict social activities and to plan the development of infrastructures across multiple scales.

Integrating Personalized Gene Expression Profiles into Predictive Disease-associated Gene Pools

J. Menche, E. Guney, A. Sharma, P. J. Branigan, M. J. Loza, F. Baribaud, R. Dobrin, A.-L. Barabasi

npj Systems Biology and Applications 3:10 (2017)

Gene expression data are routinely used to identify genes that on average exhibit different expression levels between a case and a control group. Yet, very few of such differentially expressed genes are detectably perturbed in individual patients. Here, we develop a framework to construct personalized perturbation profiles for individual subjects, identifying the set of genes that are significantly perturbed in each individual. This allows us to characterize the heterogeneity of the molecular manifestations of complex diseases by quantifying the expression-level similarities and differences among patients with the same phenotype. We show that despite the high heterogeneity of the individual perturbation profiles, patients with asthma, Parkinson and Huntington's disease share a broad pool of sporadically disease-associated genes, and that individuals with statistically significant overlap with this pool have an 80–100% chance of being diagnosed with the disease. The developed framework opens up the possibility to apply gene expression data in the context of precision medicine, with important implications for biomarker identification, drug development, diagnosis and treatment.

From Comorbidities of Chronic Obstructive Pulmonary Disease to Identification of Shared Molecular Mechanisms by Data Integration

D. Gomez-Cabrero, J. Menche, C. Vargas, I. Cano, D. Maier, A.-L. Barabasi, J. Tegner, J. Roca (Synergy-COPD Consortia)

BMC Bioinformatics 17: 1291 (2016)

Background: Deep mining of healthcare data has provided maps of comorbidity relationships between diseases. In parallel, integrative multi-omics investigations have generated high-resolution molecular maps of putative relevance for understanding disease initiation and progression. Yet, it is unclear how to advance an observation of comorbidity relations (one disease to others) to a molecular understanding of the driver processes and associated biomarkers. Results: Since chronic obstructive pulmonary disease (COPD) has emerged as a central hub in temporal comorbidity networks, we developed a systematic integrative data-driven framework to identify shared disease-associated genes and pathways, as a proxy for the underlying generative mechanisms inducing comorbidity. We integrated records from approximately 13 M patients from the Medicare database with disease-gene maps that we derived from several resources including a semantic-derived knowledge-base. Using rank-based statistics we not only recovered known comorbidities but also discovered a novel association between COPD and digestive diseases. Furthermore, our analysis provides the first set of COPD co-morbidity candidate biomarkers, including IL15, TNF and JUP, and characterizes their association to aging and life-style conditions, such as smoking and physical activity. Conclusions: The developed framework provides novel insights in COPD and especially COPD co-morbidity associated mechanisms. The methodology could be used to discover and decipher the molecular underpinning of other comorbidity relationships and furthermore, allow the identification of candidate co-morbidity biomarkers.

Quantifying the Evolution of Individual Scientific Impact

R. Sinatra, D. Wang, P. Deville, C. Song, A.-L. Barabasi

Science 354: 6312 (4 November 2016)

Despite the frequent use of numerous quantitative indicators to gauge the professional impact of a scientist, little is known about how scientific impact emerges and evolves in time. Here, we quantify the changes in impact and productivity throughout a career in science, finding that impact, as measured by influential publications, is distributed randomly within a scientist’s sequence of publications. This random-impact rule allows us to formulate a stochastic model that uncouples the effects of productivity, individual ability, and luck and unveils the existence of universal patterns governing the emergence of scientific success. The model assigns a unique individual parameter Q to each scientist, which is stable during a career, and it accurately predicts the evolution of a scientist’s impact, from the h-index to cumulative citations, and independent recognitions, such as prizes.

Controllability of multiplex, multi-time-scale networks

M. Posfai, J. Gao, S. P. Cornelius, A.-L. Barabasi, R. D'Souza

Physical Review E 94: 3, 032316 (2016)

The paradigm of layered networks is used to describe many real-world systems, from biological networks to social organizations and transportation systems. While recently there has been much progress in understanding the general properties of multilayer networks, our understanding of how to control such systems remains limited. One fundamental aspect that makes this endeavor challenging is that each layer can operate at a different time scale; thus, we cannot directly apply standard ideas from structural control theory of individual networks. Here we address the problem of controlling multilayer and multi-time-scale networks focusing on two-layer multiplex networks with one-to-one interlayer coupling. We investigate the practically relevant case when the control signal is applied to the nodes of one layer. We develop a theory based on disjoint path covers to determine the minimum number of inputs (Ni) necessary for full control. We show that if both layers operate on the same time scale, then the network structure of both layers equally affects controllability. In the presence of time-scale separation, controllability is enhanced if the controller interacts with the faster layer: Ni decreases as the time-scale difference increases up to a critical time-scale difference, above which Ni remains constant and is completely determined by the faster layer. We show that the critical time-scale difference is large if layer I is easy and layer II is hard to control in isolation. In contrast, control becomes increasingly difficult if the controller interacts with the layer operating on the slower time scale and increasing time-scale separation leads to increased Ni, again up to a critical value, above which Ni still depends on the structure of both layers. This critical value is largely determined by the longest path in the faster layer that does not involve cycles. By identifying the underlying mechanisms that connect time-scale difference and controllability for a simplified model, we provide crucial insight into disentangling how our ability to control real interacting complex systems is affected by a variety of sources of complexity.

Control Principles of Complex Systems

Y.-Y. Liu and A.-L. Barabasi

Reviews of Modern Physics 88: 3, 035006-035064 (2016)

A reflection of our ultimate understanding of a complex system is our ability to control its behavior. Typically, control has multiple prerequisites: it requires an accurate map of the network that governs the interactions between the system’s components, a quantitative description of the dynamical laws that govern the temporal behavior of each component, and an ability to influence the state and temporal behavior of a selected subset of the components. With deep roots in dynamical systems and control theory, notions of control and controllability have taken a new life recently in the study of complex networks, inspiring several fundamental questions: What are the control principles of complex systems? How do networks organize themselves to balance control with functionality? To address these questions, recent advances on the controllability and the control of complex networks are reviewed here, exploring the intricate interplay between the network topology and dynamical laws. The pertinent mathematical results are matched with empirical findings and applications. Uncovering the control principles of complex systems can help us explore and ultimately understand the fundamental laws that govern their behavior.

Control of Fluxes in Metabolic Networks

G. Basler, Z. Nikoloski, A. Larhlimi, A.-L. Barabasi, and Y.-Y. Liu

Genome Research 26: 7, 956-968 (2016)

Understanding the control of large-scale metabolic networks is central to biology and medicine. However, existing approaches either require specifying a cellular objective or can only be used for small networks. We introduce new coupling types describing the relations between reaction activities, and develop an efficient computational framework, which does not require any cellular objective for systematic studies of large-scale metabolism. We identify the driver reactions facilitating control of 23 metabolic networks from all kingdoms of life. We find that unicellular organisms require a smaller degree of control than multicellular organisms. Driver reactions are under complex cellular regulation in Escherichia coli, indicating their preeminent role in facilitating cellular control. In human cancer cells, driver reactions play pivotal roles in malignancy and represent potential therapeutic targets. The developed framework helps us gain insights into regulatory principles of diseases and facilitates design of engineering strategies at the interface of gene regulation, signaling, and metabolism.

Scaling Identity Connects Human Mobility and Social Interactions

P. Deville, C. Song, N. Eagle, V. D. Blondel, A.-L. Barabasi, D. Wang

PNAS 113: 26, 7047-7052 (2016)

Both our mobility and communication patterns obey spatial constraints: Most of the time, our trips or communications occur over a short distance, and occasionally, we take longer trips or call a friend who lives far away. These spatial dependencies, best described as power laws, play a consequential role in broad areas ranging from how an epidemic spreads to diffusion of ideas and information. Here we established the first formal link, to our knowledge, between mobility and communication patterns by deriving a scaling relationship connecting them. The uncovered scaling theory not only allows us to derive human movements from communication volumes, or vice versa, but it also documents a new degree of regularity that helps deepen our quantitative understanding of human behavior. Massive datasets that capture human movements and social interactions have catalyzed rapid advances in our quantitative understanding of human behavior during the past years. One important aspect affecting both areas is the critical role space plays. Indeed, growing evidence suggests both our movements and communication patterns are associated with spatial costs that follow reproducible scaling laws, each characterized by its specific critical exponents. Although human mobility and social networks develop concomitantly as two prolific yet largely separated fields, we lack any known relationships between the critical exponents explored by them, despite the fact that they often study the same datasets. Here, by exploiting three different mobile phone datasets that capture simultaneously these two aspects, we discovered a new scaling relationship, mediated by a universal flux distribution, which links the critical exponents characterizing the spatial dependencies in human mobility and social networks. Therefore, the widely studied scaling laws uncovered in these two areas are not independent but connected through a deeper underlying reality.

Untangling performance from success

B. Yucesoy, A.-L. Barabási

EPJ Data Science 5 (1), 17

Fame, popularity and celebrity status, frequently used tokens of success, are often loosely related to, or even divorced from, professional performance. This dichotomy is partly rooted in the difficulty of distinguishing performance, an individual measure that captures the actions of a performer, from success, a collective measure that captures a community’s reactions to these actions. Yet, finding the relationship between the two measures is essential for all areas that aim to objectively reward excellence, from science to business. Here we quantify the relationship between performance and success by focusing on tennis, an individual sport where the two quantities can be independently measured. We show that a predictive model, relying only on a tennis player’s performance in tournaments, can accurately predict an athlete’s popularity, both during a player’s active years and after retirement. Hence the model establishes a direct link between performance and momentary popularity. The agreement between the performance-driven and observed popularity suggests that in most areas of human achievement exceptional visibility may be rooted in detectable performance measures.

Controllability Analysis of the Directed Human Protein Interaction Network Identifies Disease Genes and Drug Targets

A. Vinayagam, T. E. Gibson, H.-J. Lee, B. Yilmazel, C. Roesel, Y. Hu, Y. Kwon, A. Sharma, Y.-Y. Liu, N. Perrimon, A.-L. Barabasi

Proceedings of the National Academy of Sciences 10.1073/pnas.1603992113, 1-6 (2016)

The protein-protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as "indispensable," "neutral," or "dispensable," which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network's control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.

The Network Behind the Cosmic Web

B.C. Coutinho, S. Hong, K. Albrecht, A. Day, A.-L. Barabasi, P. Torrey, M. Vogelsberger, L. Hernquist

arXiv:1604.03236v2 (13 April 2016)

The concept of the cosmic web, viewing the universe as a set of discrete galaxies held together by gravity, is deeply ingrained in cosmology. Yet, little is known about the most effective construction and the characteristics of the underlying network. Here we explore seven network construction algorithms that use various galaxy distributions provided by both simulations and observations. We find that a model relying only on spatial proximity offers the best correlations between the physical characteristics of the connected galaxies. We show that the properties of the networks generated from simulations and observations are identical, unveiling a deep universality of the cosmic web.

Universal resilience patterns in complex networks

J. Gao, B. Barzel, A.-L. Barabási

Nature 530, 307-312 (2016)

Resilience, a system’s ability to adjust its activity to retain its basic functionality when errors, failures and environmental changes occur, is a defining property of many complex systems. Despite widespread consequences for human health, the economy and the environment, events leading to loss of resilience—from cascading failures in technological systems to mass extinctions in ecological networks—are rarely predictable and are often irreversible. These limitations are rooted in a theoretical gap: the current analytical framework of resilience is designed to treat low-dimensional models with a few interacting components, and is unsuitable for multi-dimensional systems consisting of a large number of components that interact through a complex network. Here we bridge this theoretical gap by developing a set of analytical tools with which to identify the natural control and state parameters of a multi-dimensional complex system, helping us derive effective one-dimensional dynamics that accurately predict the system’s resilience. The proposed analytical framework allows us systematically to separate the roles of the system’s dynamics and topology, collapsing the behaviour of different networks onto a single universal resilience function. The analytical results unveil the network characteristics that can enhance or diminish resilience, offering ways to prevent the collapse of ecological, biological or economic systems, and guiding the design of technological systems resilient to both internal failures and environmental changes.

Network-based in silico drug efficacy screening

E. Guney, J. Menche, M. Vidal, A.-L. Barabási

Nature Communications 7:10331, 1-13 (2016)

The increasing cost of drug development together with a significant drop in the number of new drug approvals raises the need for innovative approaches for target identification and efficacy prediction. Here, we take advantage of our increasing understanding of the network-based origins of diseases to introduce a drug-disease proximity measure that quantifies the interplay between drug targets and diseases. By correcting for the known biases of the interactome, proximity helps us uncover the therapeutic effect of drugs, as well as distinguish palliative from effective treatments. Our analysis of 238 drugs used in 78 diseases indicates that the therapeutic effect of drugs is localized in a small network neighborhood of the disease genes and highlights efficacy issues for drugs used in Parkinson and several inflammatory disorders. Finally, network-based proximity allows us to predict novel drug-disease associations that offer unprecedented opportunities for drug repurposing and the detection of adverse effects.

Tissue Specificity of Human Disease Module

M. Kitsak, A. Sharma, J. Menche, E. Guney, S. D. Ghiassian, J. Loscalzo, A.-L. Barabasi

Scientific Reports 6: 35241 (2016)

Genes carrying mutations associated with genetic diseases are present in all human cells; yet, clinical manifestations of genetic diseases are usually highly tissue-specific. Although some disease genes are expressed only in selected tissues, the expression patterns of disease genes alone cannot explain the observed tissue specificity of human diseases. Here we hypothesize that for a disease to manifest itself in a particular tissue, a whole functional subnetwork of genes (disease module) needs to be expressed in that tissue. Driven by this hypothesis, we conducted a systematic study of the expression patterns of disease genes within the human interactome. We find that genes expressed in a specific tissue tend to be localized in the same neighborhood of the interactome. By contrast, genes expressed in different tissues are segregated in distinct network neighborhoods. Most important, we show that it is the integrity and the completeness of the expression of the disease module that determines disease manifestation in selected tissues. This approach allows us to construct a disease-tissue network that confirms known and predicts unexpected disease-tissue associations.

Endophenotype Network Models: Common Core of Complex Diseases

S. D. Ghiassian, J. Menche, D. I. Chasman, F. Giulianini, R. Wang, P. Ricchiuto, M. Aikawa, H. Iwata, C. Muller, T. Zeller, A. Sharma, P. Wild, K. Lackner, S. Singh, P. M. Ridker, S. Blankenberg, A.-L. Barabasi, J. Loscalzo

Scientific Reports 6: 27414, 1-13 (2016)

Historically, human diseases have been differentiated and categorized based on the organ system in which they primarily manifest. Recently, an alternative view is emerging that emphasizes that different diseases often have common underlying mechanisms and shared intermediate pathophenotypes, or endo(pheno)types. Within this framework, a specific disease’s expression is a consequence of the interplay between the relevant endophenotypes and their local, organ-based environment. Important examples of such endophenotypes are inflammation, fibrosis, and thrombosis and their essential roles in many developing diseases. In this study, we construct endophenotype network models and explore their relation to different diseases in general and to cardiovascular diseases in particular. We identify the local neighborhoods (module) within the interconnected map of molecular components, i.e., the subnetworks of the human interactome that represent the inflammasome, thrombosome, and fibrosome. We find that these neighborhoods are highly overlapping and significantly enriched with disease-associated genes. In particular they are also enriched with differentially expressed genes linked to cardiovascular disease (risk). Finally, using proteomic data, we explore how macrophage activation contributes to our understanding of inflammatory processes and responses. The results of our analysis show that inflammatory responses initiate from within the cross-talk of the three identified endophenotypic modules.

Canonical genetic signatures of the adult human brain

M. Hawrylycz, J. A. Miller, V. Menon, D. Feng, T. Dolbeare, A. L. Guillozet-Bongaarts, A. G. Jegga, B. J. Aronow, C.-K. Lee, A. Bernard, M. F. Glasser, D. L. Dierker, J. Menche, A. Szafer, F. Collman, P. Grange, K. A. Berman, S. Mihalas, Z. Yao, L. Stewart, A.-L. Barabási, J. Schulkin, J. Phillips, L. Ng, C. Dang, D. R. Haynor, A. Jones, D. C. Van Essen, C. Koch, D. Lein

Nature Neuroscience 4171, 1-15 (2015)

The structure and function of the human brain are highly stereotyped, implying a conserved molecular program responsible for its development, cellular structure and function. We applied a correlation-based metric called differential stability to assess reproducibility of gene expression patterning across 132 structures in six individual brains, revealing mesoscale genetic organization. The genes with the highest differential stability are highly biologically relevant, with enrichment for brain-related annotations, disease associations, drug targets and literature citations. Using genes with high differential stability, we identified 32 anatomically diverse and reproducible gene expression signatures, which represent distinct cell types, intracellular components and/or associations with neurodevelopmental and neurodegenerative disorders. Genes in neuron-associated compared to non-neuronal networks showed higher preservation between human and mouse; however, many diversely patterned genes displayed marked shifts in regulation between species. Finally, highly consistent transcriptional architecture in neocortex is correlated with resting state functional connectivity, suggesting a link between conserved gene expression and functionally relevant circuitry.

Returners and explorers dichotomy in human mobility

L. Pappalardo, F. Simini, S. Rinzivillo, D. Pedreschi, F. Giannotti, A.-L. Barabási

Nature Communications 6:8166, 1-8 (2015)

The availability of massive digital traces of human whereabouts has offered a series of novel insights on the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.

Spectrum of controlling and observing complex networks

G. Yan, G. Tsekenis, B. Barzel, J.-J. Slotine, Y.-Y. Liu, A.-L. Barabási

Nature Physics 11, 779-796 (2015)

Recent studies have made important advances in identifying sensor or driver nodes, through which we can observe or control a complex system. But the observational uncertainty induced by measurement noise and the energy required for control continue to be significant challenges in practical applications. Here we show that the variability of control energy and observational uncertainty for different directions of the state space depend strongly on the number of driver nodes. In particular, we find that if all nodes are directly driven, control is energetically feasible, as the maximum energy increases sub-linearly with the system size. If, however, we aim to control a system through a single node, control in some directions is energetically prohibitive, increasing exponentially with the system size. For the cases in between, the maximum energy decays exponentially when the number of driver nodes increases. We validate our findings in several model and real networks, arriving at a series of fundamental laws to describe the control energy that together deepen our understanding of complex systems.

Constructing minimal models for complex system dynamics

B. Barzel, Y.-Y. Liu, A.-L. Barabási

Nature Communications 6:7186, 1-8 (2015)

One of the strengths of statistical physics is the ability to reduce macroscopic observations into microscopic models, offering a mechanistic description of a system’s dynamics. This paradigm, rooted in Boltzmann’s gas theory, has found applications from magnetic phenomena to subcellular processes and epidemic spreading. Yet, each of these advances was the result of decades of meticulous model building and validation, which are impossible to replicate in most complex biological, social or technological systems that lack accurate microscopic models. Here we develop a method to infer the microscopic dynamics of a complex system from observations of its response to external perturbations, allowing us to construct the most general class of nonlinear pairwise dynamics that are guaranteed to recover the observed behavior. The result, which we test against both numerical and empirical data, is an effective dynamic model that can predict the system’s behavior and provide crucial insights into its inner workings.

The observation that disease-associated proteins often interact with each other has fueled the development of network-based approaches to elucidate the molecular mechanisms of human disease. Such approaches build on the assumption that protein interaction networks can be viewed as maps in which diseases can be identified with localized perturbation within a certain neighborhood. The identification of these neighborhoods, or disease modules, is therefore a prerequisite of a detailed investigation of a particular pathophenotype. While numerous heuristic methods exist that successfully pinpoint disease-associated modules, the basic underlying connectivity patterns remain largely unexplored. In this work we aim to fill this gap by analyzing the network properties of a comprehensive corpus of 70 complex diseases. We find that disease-associated proteins do not reside within locally dense communities and instead identify connectivity significance as the most predictive quantity. This quantity inspires the design of a novel Disease Module Detection (DIAMOnD) algorithm to identify the full disease module around a set of known disease proteins. We study the performance of the algorithm using well-controlled synthetic data and systematically validate the identified neighborhoods for a large corpus of diseases.

Uncovering disease-disease relationships through the incomplete interactome

J. Menche, A. Sharma, M. Kitsak, D. Ghiassian, M. Vidal, J. Loscalzo, A.-L. Barabasi

Science 347:6224, 1257601-1 (2015)

According to the disease module hypothesis, the cellular components associated with a disease segregate in the same neighborhood of the human interactome, the map of biologically relevant molecular interactions. Yet, given the incompleteness of the interactome and the limited knowledge of disease-associated genes, it is not obvious if the available data have sufficient coverage to map out modules associated with each disease. Here we derive mathematical conditions for the identifiability of disease modules and show that the network-based location of each disease module determines its pathobiological relationship to other diseases. For example, diseases with overlapping network modules show significant coexpression patterns, symptom similarity, and comorbidity, whereas diseases residing in separated network neighborhoods are phenotypically distinct. These tools represent an interactome-based platform to predict molecular commonalities between phenotypically related diseases, even if they do not share primary disease genes.

A disease module in the interactome explains disease heterogeneity, drug response and captures novel pathways and genes in asthma

A. Sharma, J. Menche, C. C. Huang, T. Ort, X. Zhou, M. Kitsak, N. Sahni, D. Thibault, L. Voung, F. Guo, S. D. Ghiassian, N. Gulbahce, F. Baribaud, J. Tocker, R. Dobrin, E. Barnathan, H. Liu, R. A. Panettieri Jr., K. G. Tantisira, W. Qiu, B. A. Raby, E. K. Silverman, M. Vidal, S. T. Weiss, and A.-L. Barabási

Human Molecular Genetics 101093, 1-16 (2015)

Recent advances in genetics have spurred rapid progress towards the systematic identification of genes involved in complex diseases. Still, the detailed understanding of the molecular and physiological mechanisms through which these genes affect disease phenotypes remains a major challenge. Here, we identify the asthma disease module, i.e. the local neighborhood of the interactome whose perturbation is associated with asthma, and validate it for functional and pathophysiological relevance, using both computational and experimental approaches. We find that the asthma disease module is enriched with modest GWAS P-values against the background of random variation, and with differentially expressed genes from normal and asthmatic fibroblast cells treated with an asthma-specific drug. The asthma module also contains immune response mechanisms that are shared with other immune-related disease modules. Further, using diverse omics (genomics, gene-expression, drug response) data, we identify the GAB1 signaling pathway as an important novel modulator in asthma. The wiring diagram of the uncovered asthma module suggests a relatively close link between GAB1 and glucocorticoids (GCs), which we experimentally validate, observing an increase in the level of GAB1 after GC treatment in BEAS-2B bronchial epithelial cells. The siRNA knockdown of GAB1 in the BEAS-2B ce

Destruction perfected

I. A. Kovács, A.-L. Barabási

Nature (News & Views) 524, 38-39 (2015)

Pinpointing the nodes whose removal most effectively disrupts a network has become a lot easier with the development of an efficient algorithm. Potential applications might include cybersecurity and disease control. See Letter p.65, by F. Morone and H. A. Makse (Supplementary 1).

A proteome-scale map of the human interactome network

T. Rolland, M. Tasan, B. Charloteaux, S. J. Pevzner, Q. Zhong, N. Sahni, S. Yi, I. Lemmens, C. Fontanillo, R. Mosca, A. Kamburov, S. D. Ghiassian, X. Yang, L. Ghamsari, D. Balcha, B. E. Begg, P. Braun, M. Brehm, M. P. Froly, A.-R. Carvunis, D. Convery-Zupan, R. Carominas, J. Coulombe-Huntington, E. Dann, M. Dreze, A. Dricot, C. Fan, E. Franzosa, F. Gebrea, B. J. Gutierrez, M. F. Hardy, M. Jin, S. Kang, R. Kiros, G. Lin, K. Luck, A. MacWilliams, J. Menche, R. R. Murray, A. Palagi, M. M. Poulin, X. Rambout, J. Rasla, P. Reichert, V. Romero, E. Ruyssinck, J. M. Sahalie, plus 20 more co-authors

Cell 159:5, 1212-1226 (2014)

Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ∼14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ∼30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant interconnectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help “connect the dots” of the genomic revolution.

Collective credit allocation in science

H.-W. Shen, A.-L. Barabasi

Proceedings of the National Academy of Sciences 10.1073/pnas.1401992111, 1-6 (2014)

Collaboration among researchers is an essential component of the modern scientific enterprise, playing a particularly important role in multidisciplinary research. However, we continue to wrestle with allocating credit to the coauthors of publications with multiple authors, because the relative contribution of each author is difficult to determine. At the same time, the scientific community runs an informal field-dependent credit allocation process that assigns credit in a collective fashion to each work. Here we develop a credit allocation algorithm that captures the coauthors’ contribution to a publication as perceived by the scientific community, reproducing the informal collective credit allocation of science. We validate the method by identifying the authors of Nobel-winning papers that are credited for the discovery, independent of their positions in the author list. The method can also compare the relative impact of researchers working in the same field, even if they did not publish together. The ability to accurately measure the relative credit of researchers could affect many aspects of credit allocation in science, potentially impacting hiring, funding, and promotion decisions.

A network framework of cultural history

M. Schich, C. Song, Y. Y. Ahn, A. Mirsky, M. Martino, A.-L. Barabási, D. Helbing

Science 345, 558-562 (2014)

The emergent processes driving cultural history are a product of complex interactions among large numbers of individuals, determined by difficult-to-quantify historical conditions. To characterize these processes, we have reconstructed aggregate intellectual mobility over two millennia through the birth and death locations of more than 150,000 notable individuals. The tools of network and complexity theory were then used to identify characteristic statistical patterns and determine the cultural and historical relevance of deviations. The resulting network of locations provides a macroscopic perspective of cultural history, which helps us to retrace cultural narratives of Europe and North America using large-scale visualization and quantitative dynamical tools and to derive historical trends of cultural centers beyond the scope of specific events or narrow time intervals.

A genetic epidemiology approach to cyber-security

S. Gil, A. Kott, A.-L. Barabási

Scientific Reports 4:5659, 1-7 (2014)

While much attention has been paid to the vulnerability of computer networks to node and link failure, there is limited systematic understanding of the factors that determine the likelihood that a node (computer) is compromised. We therefore collect threat log data in a university network to study the patterns of threat activity for individual hosts. We relate this information to the properties of each host as observed through network-wide scans, establishing associations between the network services a host is running and the kinds of threats to which it is susceptible. We propose a methodology to associate services to threats inspired by the tools used in genetics to identify statistical associations between mutations and diseases. The proposed approach allows us to determine probabilities of infection directly from observation, offering an automated high-throughput strategy to develop comprehensive metrics for cyber-security.

Human symptoms–disease network

X. Z. Zhou, J. Menche, A.-L. Barabási, A. Sharma

Nature Communications 5:4212, 1-10 (2014)

In the post-genomic era, the elucidation of the relationship between the molecular origins of diseases and their resulting phenotypes is a crucial task for medical research. Here, we use a large-scale biomedical literature database to construct a symptom-based human disease network and investigate the connection between clinical manifestations of diseases and their underlying molecular interactions. We find that the symptom-based similarity of two diseases correlates strongly with the number of shared genetic associations and the extent to which their associated proteins interact. Moreover, the diversity of the clinical manifestations of a disease can be related to the connectivity patterns of the underlying protein interaction network. The comprehensive, high-quality map of disease–symptom relations can further be used as a resource helping to address important questions in the field of systems medicine, for example, the identification of unexpected associations between diseases, disease etiology research or drug design.

Career on the move: Geography, stratification, and scientific impact

P. Deville, D. Wang, R. Sinatra, C. Song, V. Blondel, A.-L. Barabási

Scientific Reports 4, 1-7 (2014)

Changing institutions is an integral part of academic life. Yet little is known about the mobility patterns of scientists at an institutional level and how these career choices affect scientific outcomes. Here, we examine over 420,000 papers to track the affiliation information of individual scientists, allowing us to reconstruct their career trajectories over decades. We find that career movements are not only temporally and spatially localized, but also characterized by a high degree of stratification in institutional ranking. When cross-group movement occurs, we find that while going from elite to lower-rank institutions is on average associated with a modest decrease in scientific performance, transitioning into elite institutions does not result in a subsequent performance gain. These results offer empirical evidence on institutional-level career choices and movements and have potential implications for science policy.

A diVIsive Shuffling Approach (VIStA) for gene expression analysis to identify subtypes in Chronic Obstructive Pulmonary Disease

J. Menche, A. Sharma, M. H. Cho, R. J. Mayer, S. I. Rennard, B. Celli, B. E. Miller, N. Locantore, R. Tal-Singer, S. Ghosh, C. Larminie, G. Bradley, J. H. Riley, A. Agusti, E. K. Silverman, A.-L. Barabási

BMC Systems Biology 8, 1-13 (2014)

Background: An important step toward understanding the biological mechanisms underlying a complex disease is a refined understanding of its clinical heterogeneity. Relating clinical and molecular differences may allow us to define more specific subtypes of patients that respond differently to therapeutic interventions. Results: We developed a novel unbiased method called diVIsive Shuffling Approach (VIStA) that identifies subgroups of patients by maximizing the difference in their gene expression patterns. We tested our algorithm on 140 subjects with Chronic Obstructive Pulmonary Disease (COPD) and found four distinct, biologically and clinically meaningful combinations of clinical characteristics that are associated with large gene expression differences. The dominant characteristic in these combinations was the severity of airflow limitation. Other frequently identified measures included emphysema, fibrinogen levels, phlegm, BMI and age. A pathway analysis of the differentially expressed genes in the identified subtypes suggests that VIStA is capable of capturing specific molecular signatures within each group. Conclusions: The introduced methodology allowed us to identify combinations of clinical characteristics that correspond to clear gene expression differences. The resulting subtypes for COPD contribute to a better understanding of its heterogeneity.

Bordering Fiction

A.-L. Barabási

Science 343: 6169 (2014)

Eggers portrays a world--in which an omnipotent social networking company encourages everyone to monitor everybody everywhere--that feels eerily everyday.

Quantifying information flow during emergencies

L. Gao, C. Song, Z. Gao, A.-L. Barabasi, J. P. Bagrow, D. Wang

Scientific Reports 4, 1-6 (2014)

Recent advances in human dynamics have focused on the normal patterns of human activities, with the quantitative understanding of human behavior under extreme events remaining a crucial missing chapter. Such an understanding has a wide array of potential applications, ranging from emergency response and detection to traffic control and management. Previous studies have shown that human communications are both temporally and spatially localized following the onset of emergencies, indicating that social propagation is a primary means to propagate situational awareness. We study real anomalous events using country-wide mobile phone data, finding that information flow during emergencies is dominated by repeated communications. We further demonstrate that the observed communication patterns cannot be explained by inherent reciprocity in social networks, and are universal across different demographics.

Modeling and predicting popularity dynamics via reinforced poisson processes

H. Shen, D. Wang, C. Song, A.-L. Barabási

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 291-297 (2014)

An ability to predict the popularity dynamics of individual items within a complex evolving system has important implications in an array of areas. Here we propose a generative probabilistic framework using a reinforced Poisson process to explicitly model the process through which individual items gain their popularity. This model distinguishes itself from existing models via its capability of modeling the arrival process of popularity and its remarkable power at predicting the popularity of individual items. It possesses the flexibility of applying Bayesian treatment to further improve the predictive power using a conjugate prior. Extensive experiments on a longitudinal citation dataset demonstrate that this model consistently outperforms existing popularity prediction methods.

Target control of complex networks

J. Gao, Y.-Y. Liu, R. M. D'Souza, A.-L. Barabási

Nature Communications 5:5415, 1-7 (2014)

Controlling large natural and technological networks is an outstanding challenge. It is typically neither feasible nor necessary to control the entire network, prompting us to explore target control: the efficient control of a preselected subset of nodes. We show that the structural controllability approach used for full control overestimates the minimum number of driver nodes needed for target control. Here we develop an alternate ‘k-walk’ theory for directed tree networks, and we rigorously prove that one node can control a set of target nodes if the path length to each target node is unique. For more general cases, we develop a greedy algorithm to approximate the minimum set of driver nodes sufficient for target control. We find that degree heterogeneous networks are target controllable with higher efficiency than homogeneous networks and that the structure of many real-world networks is suitable for efficient target control.

Network-based analysis of genome wide association data provides novel candidate genes for lipid and lipoprotein traits

A. Sharma, N. Gulbahce, S. J. Pevzner, J. Menche, C. Ladenvall, L. Folkersen, P. Eriksson, M. Orho-Melander, A.-L. Barabási

Molecular & Cellular Proteomics 12, 3398-3408 (2013)

Genome wide association studies (GWAS) identify susceptibility loci for complex traits, but do not identify particular genes of interest. Integration of functional and network information may help in overcoming this limitation and identifying new susceptibility loci. Using GWAS and comorbidity data, we present a network-based approach to predict candidate genes for lipid and lipoprotein traits. We apply a prediction pipeline incorporating interactome, co-expression, and comorbidity data to Global Lipids Genetics Consortium (GLGC) GWAS for four traits of interest, identifying phenotypically coherent modules. These modules provide insights regarding gene involvement in complex phenotypes with multiple susceptibility alleles and low effect sizes. To experimentally test our predictions, we selected four candidate genes and genotyped representative SNPs in the Malmö Diet and Cancer Cardiovascular Cohort. We found significant associations with LDL-C and total-cholesterol levels for a synonymous SNP (rs234706) in the cystathionine beta-synthase (CBS) gene (p = 1 × 10⁻⁵ and adjusted-p = 0.013, respectively). Further, liver samples taken from 206 patients revealed that patients with the minor allele of rs234706 had significant dysregulation of CBS (p = 0.04). Despite the known biological role of CBS in lipid metabolism, SNPs within the locus have not yet been identified in GWAS of lipoprotein traits. Thus, the GWAS-based Comorbidity Module (GCM) approach identifies candidate genes missed by GWAS studies, serving as a broadly applicable tool for the investigation of other complex disease phenotypes.

Uncovering the role of elementary processes in network evolution

G. Ghoshal, L. Chi, A.-L. Barabási

Scientific Reports 3, 1-8 (2013)

The growth and evolution of networks has elicited considerable interest from the scientific community and a number of mechanistic models have been proposed to explain their observed degree distributions. Various microscopic processes have been incorporated in these models, among them, node and edge addition, vertex fitness and the deletion of nodes and edges. The existing models, however, focus on specific combinations of these processes and parameterize them in a way that makes it difficult to elucidate the role of the individual elementary mechanisms. We therefore formulated and solved a model that incorporates the minimal processes governing network evolution. Some contribute to growth, such as the formation of connections between existing pairs of vertices, while others capture deletion: the removal of a node with its corresponding edges, or the removal of an edge between a pair of vertices. We distinguish between these elementary mechanisms, identifying their specific role on network evolution.

Quantifying Long-Term Scientific Impact

D. Wang, C. Song, A.-L. Barabási

Science 342, 127-131 (2013)

The lack of predictability of citation-based measures frequently used to gauge impact, from impact factors to short-term citations, raises a fundamental question: Is there long-term predictability in citation patterns? Here, we derive a mechanistic model for the citation dynamics of individual papers, allowing us to collapse the citation histories of papers from different journals and disciplines into a single curve, indicating that all papers tend to follow the same universal temporal pattern. The observed patterns not only help us uncover basic mechanisms that govern scientific impact but also offer reliable measures of influence that may have potential policy implications.

Controlling complex systems is a fundamental challenge of network science. Recent advances indicate that control over the system can be achieved through a minimum driver node set (MDS). The existence of multiple MDS's suggests that nodes do not participate in control equally, prompting us to quantify their participations. Here we introduce control capacity quantifying the likelihood that a node is a driver node. To efficiently measure this quantity, we develop a random sampling algorithm. This algorithm not only provides a statistical estimate of the control capacity, but also bridges the gap between multiple microscopic control configurations and macroscopic properties of the network under control. We demonstrate that the possibility of being a driver node decreases with a node's in-degree and is independent of its out-degree. Given the inherent multiplicity of MDS's, our findings offer tools to explore control in various complex systems.

Network Science

Albert-László Barabási

Philosophical Transactions of The Royal Society 371, 1-3 (2013)

Professor Barabási's talk described how the tools of network science can help understand the Web's structure, development and weaknesses. The Web is an information network, in which the nodes are documents (at the time of writing over one trillion of them), connected by links. Other well-known network structures include the Internet, a physical network where the nodes are routers and the links are physical connections, and organizations, where the nodes are people and the links represent communications.

Observability of complex systems

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Proceedings of the National Academy of Sciences 110, 1-6 (2013)

A quantitative description of a complex system is inherently limited by our ability to estimate the system’s internal state from experimentally accessible outputs. Although the simultaneous measurement of all internal variables, like all metabolite concentrations in a cell, offers a complete description of a system’s state, in practice experimental access is limited to only a subset of variables, or sensors. A system is called observable if we can reconstruct the system’s complete internal state from its outputs. Here, we adopt a graphical approach derived from the dynamical laws that govern a system to determine the sensors that are necessary to reconstruct the full internal state of a complex system. We apply this approach to biochemical reaction systems, finding that the identified sensors are not only necessary but also sufficient for observability. The developed approach can also identify the optimal sensors for target or partial observability, helping us reconstruct selected state variables from appropriately chosen outputs, a prerequisite for optimal biomarker design. Given the fundamental role observability plays in complex systems, these results offer avenues to systematically explore the dynamics of a wide range of natural, technological and socioeconomic systems.

Emergence of bimodality in controlling complex networks

T. Jia, Y.-Y. Liu, E. Csóka, M. Pósfai, J.-J. Slotine, A.-L. Barabási

Nature Communications 4:2002, 1-6 (2013)

Our ability to control complex systems is a fundamental challenge of contemporary science. Recently introduced tools to identify the driver nodes, nodes through which we can achieve full control, predict the existence of multiple control configurations, prompting us to classify each node in a network based on their role in control. Accordingly a node is critical, intermittent or redundant if it acts as a driver node in all, some or none of the control configurations. Here we develop an analytical framework to identify the category of each node, leading to the discovery of two distinct control modes in complex systems: centralized versus distributed control. We predict the control mode for an arbitrary network and show that one can alter it through small structural perturbations. The uncovered bimodality has implications from network security to organizational research and offers new insights into the dynamics and control of complex systems.

Effect of correlations on network controllability

M. Pósfai, Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Scientific Reports 3:1067, 1-7 (2013)

A dynamical system is controllable if by imposing appropriate external signals on a subset of its nodes, it can be driven from any initial state to any desired state in finite time. Here we study the impact of various network characteristics on the minimal number of driver nodes required to control a network. We find that clustering and modularity have no discernible impact, but the symmetries of the underlying matching problem can produce linear, quadratic or no dependence on degree correlation coefficients, depending on the nature of the underlying correlations. The results are supported by numerical simulations and help narrow the observed gap between the predicted and the observed number of driver nodes in real networks.

Universality in network dynamics

B. Barzel, A.-L. Barabási

Nature Physics 9, 673-681 (2013)

Despite significant advances in characterizing the structural properties of complex networks, a mathematical framework that uncovers the universal properties of the interplay between the topology and the dynamics of complex systems continues to elude us. Here we develop a self-consistent theory of dynamical perturbations in complex systems, allowing us to systematically separate the contribution of the network topology and dynamics. The formalism covers a broad range of steady-state dynamical processes and offers testable predictions regarding the system’s response to perturbations and the development of correlations. It predicts several distinct universality classes whose characteristics can be derived directly from the continuum equation governing the system’s dynamics and which are validated on several canonical network-based dynamical systems, from biochemical dynamics to epidemic spreading. Finally, we collect experimental data pertaining to social and biological systems, demonstrating that we can accurately uncover their universality class even in the absence of an appropriate continuum theory that governs the system’s dynamics.

Network link prediction by global silencing of indirect correlations

B. Barzel, A.-L. Barabási

Nature Biotechnology 31: Num 8, 1-8 (2013)

Predictions of physical and functional links between cellular components are often based on correlations between experimental measurements, such as gene expression. However, correlations are affected by both direct and indirect paths, confounding our ability to identify true pairwise interactions. Here we exploit the fundamental properties of dynamical correlations in networks to develop a method to silence indirect effects. The method receives as input the observed correlations between node pairs and uses a matrix transformation to turn the correlation matrix into a highly discriminative silenced matrix, which enhances only the terms associated with direct causal links. Against empirical data for Escherichia coli regulatory interactions, the method enhanced the discriminative power of the correlations by twofold, yielding >50% predictive improvement over traditional correlation measures and 6% over mutual information. Overall this silencing method will help translate the abundant correlation data into insights about a system's interactions, with applications ranging from link prediction to inferring the dynamical mechanisms governing biological networks.

Handful of papers dominates citation

A.-L. Barabási, C. Song, D. Wang

Nature 491, 40 (2012)

An ‘impact disparity’ is emerging in science — only a few papers earn the largest share of citations. This is comparable to the income disparity in the United States, known as the 1% phenomenon, where 1% of the population earns a disproportionate 17.4% of total income.

Control centrality and hierarchical structure in complex networks

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

PLoS One 7, e44459 (2012)

We introduce the concept of control centrality to quantify the ability of a single node to control a directed weighted network. We calculate the distribution of control centrality for several real networks and find that it is mainly determined by the network’s degree distribution. We show that in a directed network without loops the control centrality of a node is uniquely determined by its layer index or topological position in the underlying hierarchical structure of the network. Inspired by the deep relation between control centrality and hierarchical structure in a general directed network, we design an efficient attack strategy against the controllability of malicious networks.

Network science: Luck or reason

Albert-László Barabási

Nature 489, 1-2 (2012)

The concept of preferential attachment is behind the hubs and power laws seen in many networks. New results fuel an old debate about its origin, and beg the question of whether it is based on randomness or optimization.

Dynamics of ranking processes in complex systems

N. Blumm, G. Ghoshal, Z. Forro, M. Schich, G. Bianconi, J.-P. Bouchaud, A.-L. Barabási

Physical Review Letters 109, 128701:1-5 (2012)

The world is addicted to ranking: everything, from the reputation of scientists, journals, and universities to purchasing decisions is driven by measured or perceived differences between them. Here, we analyze empirical data capturing real time ranking in a number of systems, helping to identify the universal characteristics of ranking dynamics. We develop a continuum theory that not only predicts the stability of the ranking process, but shows that a noise-induced phase transition is at the heart of the observed differences in ranking regimes. The key parameters of the continuum theory can be explicitly measured from data, allowing us to predict and experimentally document the existence of three phases that govern ranking stability.

Interpreting cancer genomes using systematic host network perturbations by tumour virus proteins

O. Rozenblatt-Rosen, R. C. Deo, M. Padi, G. Adelmant, T. Rolland, M. Grace, A. Dricot, M. Askenazi, M. Tavares, S. J. Pevzner, F. Abderazzaq, D. Byrdsong, A.-R. Carvunis, A. A. Chen, J. Cheng, M. Correll, M. Duarte, C. Fan, M. C. Feltkamp, S. B. Ficarro, R. Franchi, B. K. Garg, N. Gulbahce, T. Hao, A. M. Holthaus, R. James, A. Korkhin, L. Litovchick, J. C. Mar, T. R. Pak, S. Rabello, R. Rubio, Y. Shen, S. Singh, J. M. Spangle, M. Tasan, S. Wanamaker, J. T. Webber, J. Roecklein-Canfield, E. Johannsen, A.-L. Barabási, R. Beroukhim, E. Kieff, M. E. Cusick, D. E. Hill, K. Munger, J. A. Marto, J. Quackenbush, F. P. Roth, J. A. DeCaprio, M. Vidal

Nature 487, 491-495 (2012)

Genotypic differences greatly influence susceptibility and resistance to disease. Understanding genotype–phenotype relationships requires that phenotypes be viewed as manifestations of network properties, rather than simply as the result of individual genomic variations. Genome sequencing efforts have identified numerous germline mutations, and large numbers of somatic genomic alterations, associated with a predisposition to cancer. However, it remains difficult to distinguish background, or ‘passenger’, cancer mutations from causal, or ‘driver’, mutations in these data sets. Human viruses intrinsically depend on their host cell during the course of infection and can elicit pathological phenotypes similar to those arising from mutations. Here we test the hypothesis that genomic variations and tumour viruses may cause cancer through related mechanisms, by systematically examining host interactome and transcriptome network perturbations caused by DNA tumour virus proteins. The resulting integrated viral perturbation data reflects rewiring of the host cell networks, and highlights pathways, such as Notch signalling and apoptosis, that go awry in cancer. We show that systematic analyses of host targets of viral proteins can identify cancer genes with a success rate on a par with their identification through functional genomics and large-scale cataloguing of tumour mutations. Together, these complementary approaches increase the specificity of cancer gene identification. Combining systems-level studies of pathogen-encoded gene products with genomic approaches will facilitate the prioritization of cancer causing driver genes to advance the understanding of the genetic basis of human cancer.

A universal model for mobility and migration patterns

Albert-László Barabási

Nature 484, 96-100 (2012)

Reductionism, as a paradigm, is expired, and complexity, as a field, is tired. Data-based mathematical models of complex systems are offering a fresh perspective, rapidly developing into a new discipline: network science.

Graph theory properties of cellular networks (Chapter 9)

B. Barzel, A. Sharma, A.-L. Barabási

Handbook of Systems Biology – Concepts and Insights (Academic Press, Elsevier), 177-193 (2013)

Sex differences in intimate relationships

V. Palchykov, K. Kaski, J. Kertesz, A.-L. Barabási, R. Dunbar

Scientific Reports 2:370, 105 (2012)

Social networks based on dyadic relationships are fundamentally important for understanding of human sociality. However, we have little understanding of the dynamics of close relationships and how these change over time. Evolutionary theory suggests that, even in monogamous mating systems, the pattern of investment in close relationships should vary across the lifespan when post-weaning investment plays an important role in maximizing fitness. Mobile phone data sets provide a unique window into the structure and dynamics of relationships. We here use data from a large mobile phone dataset to demonstrate striking sex differences in the gender-bias of preferred relationships that reflect the way the reproductive investment strategies of both sexes change across the lifespan, i.e. women’s shifting patterns of investment in reproduction and parental care. These results suggest that human social strategies may have more complex dynamics than previously assumed and a life-history perspective is crucial for understanding them.

Flavor network and the principles of food pairing

Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási

Scientific Reports 1:196 (2011)

The cultural diversity of culinary practice, as illustrated by the variety of regional cuisines, raises the question of whether there are any general patterns that determine the ingredient combinations used in food today or principles that transcend individual tastes and recipes. We introduce a flavor network that captures the flavor compounds shared by culinary ingredients. Western cuisines show a tendency to use ingredient pairs that share many flavor compounds, supporting the so-called food pairing hypothesis. By contrast, East Asian cuisines tend to avoid compound sharing ingredients. Given the increasing availability of information on food preparation, our data-driven investigation opens new avenues towards a systematic understanding of culinary practice.

Systems biology and the future of medicine

J. Loscalzo, A.-L. Barabási

WIREs Systems Biology and Medicine 3, 619-627 (2011)

Contemporary views of human disease are based on simple correlation between clinical syndromes and pathological analysis dating from the late 19th century. Although this approach to disease diagnosis, prognosis, and treatment has served the medical establishment and society well for many years, it has serious shortcomings for the modern era of the genomic medicine that stem from its reliance on reductionist principles of experimentation and analysis. Quantitative, holistic systems biology applied to human disease offers a unique approach for diagnosing established disease, defining disease predilection, and developing individualized (personalized) treatment strategies that can take full advantage of modern molecular pathobiology and the comprehensive data sets that are rapidly becoming available for populations and individuals. In this way, systems pathobiology offers the promise of redefining our approach to disease and the field of medicine.

Few inputs can reprogram biological networks (reply by Liu et al.)

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Nature 473, 167-173 (2011)

Reply to Franz-Josef Muller and Andreas Schuppert (Nature 478, Pg. E4, Oct. 2011)

Human Mobility, Social Ties, and Link Prediction

D. Wang, D. Pedreschi, C. Song, F. Giannotti, A.-L. Barabási

ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (2011)

Our understanding of how individual mobility patterns shape and impact the social network is limited, but is essential for a deeper understanding of network dynamics and evolution. This question is largely unexplored, partly due to the difficulty in obtaining large-scale society-wide data that simultaneously capture the dynamical information on individual movements and social interactions. Here we address this challenge for the first time by tracking the trajectories and communication records of 6 million mobile phone users. We find that the similarity between two individuals' movements strongly correlates with their proximity in the social network. We further investigate how the predictive power hidden in such correlations can be exploited to address a challenging problem: which new links will develop in a social network. We show that mobility measures alone yield surprising predictive power, comparable to traditional network-based measures. Furthermore, the prediction accuracy can be significantly improved by learning a supervised classifier based on combined mobility and network measures. We believe our findings on the interplay of mobility patterns and social ties offer new perspectives on not only link prediction but also network dynamics.

Ranking stability and super-stable nodes in complex networks

G. Ghoshal, A.-L. Barabási

Nature Communications 2, 1-7 (2011)

Pagerank, a network-based diffusion algorithm, has emerged as the leading method to rank web content, ecological species and even scientists. Despite its wide use, it remains unknown how the structure of the network on which it operates affects its performance. Here we show that for random networks the ranking provided by pagerank is sensitive to perturbations in the network topology, making it unreliable for incomplete or noisy systems. In contrast, in scale-free networks we predict analytically the emergence of super-stable nodes whose ranking is exceptionally stable to perturbations. We calculate the dependence of the number of super-stable nodes on network characteristics and demonstrate their presence in real networks, in agreement with the analytical predictions. These results not only deepen our understanding of the interplay between network topology and dynamical processes but also have implications in all areas where ranking has a role, from science to marketing.
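For readers unfamiliar with the algorithm this abstract analyses, here is a minimal sketch of PageRank computed by power iteration. The damping factor of 0.85, the dictionary-of-lists graph format, and the uniform handling of dangling nodes are common conventions assumed for the example, not details taken from the paper.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank on a directed graph given as {node: [targets]}."""
    nodes = list(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                # Each node shares its current rank equally among its out-links.
                new[v] += damping * rank[u] / len(targets)
        change = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if change < tol:
            break
    return rank


# Hypothetical four-node graph in which every other node points at "hub".
star = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": []}
star_rank = pagerank(star)
```

To probe the kind of sensitivity the abstract discusses, one could remove or rewire a few links, rerun `pagerank`, and compare the resulting orderings.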

Controllability of complex networks

Y.-Y. Liu, J.-J. Slotine, A.-L. Barabási

Nature 473, 167-173 (2011)

The ultimate proof of our understanding of natural or technological systems is reflected in our ability to control them. Although control theory offers mathematical tools for steering engineered and natural systems towards a desired state, a framework to control complex self-organized systems is lacking. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system’s entire dynamics. We apply these tools to several real networks, finding that the number of driver nodes is determined mainly by the network’s degree distribution. We show that sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes. Counterintuitively, we find that in both model and real systems the driver nodes tend to avoid the high-degree nodes.
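The sketch below illustrates one way to identify a set of driver nodes, assuming the standard maximum-matching formulation of structural controllability: nodes left unmatched by a maximum matching of the directed network need their own input signal, and at least one driver is always required. The augmenting-path matching routine, the integer node labels, and the function names are choices made for this example.

```python
def max_matching(n, edges):
    """Augmenting-path (Kuhn) matching on the bipartite graph out-copy(u) -> in-copy(v).

    Returns a list where matched_from[v] is the node u whose link u -> v is in
    the matching, or None if v is unmatched.
    """
    succ = [[] for _ in range(n)]
    for u, v in edges:
        succ[u].append(v)
    matched_from = [None] * n

    def augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if matched_from[v] is None or augment(matched_from[v], seen):
                matched_from[v] = u
                return True
        return False

    for u in range(n):
        augment(u, set())
    return matched_from


def driver_nodes(n, edges):
    """One minimum driver node set: the nodes left unmatched by a maximum matching."""
    matched_from = max_matching(n, edges)
    unmatched = [v for v in range(n) if matched_from[v] is None]
    # A perfectly matched network still needs one input; node 0 is an arbitrary pick.
    return unmatched if unmatched else [0]


chain = [(0, 1), (1, 2), (2, 3)]   # directed path: one driver suffices
star = [(0, 1), (0, 2), (0, 3)]    # hub with three leaves: three drivers
```

The maximum matching is generally not unique, so the returned set is only one of possibly many minimum driver sets. In the star example the hub itself must be driven together with two of its leaves.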

Geographic Constraints on Social Network Groups

J. P. Onnela, S. Arbesman, M. C. Gonzalez, A.-L. Barabási, N. A. Christakis

PLoS One 6:4, 1-7 (2011)

Social groups are fundamental building blocks of human societies. While our social interactions have always been constrained by geography, it has been impossible, due to practical difficulties, to evaluate the nature of this restriction on social group structure. We construct a social network of individuals whose most frequent geographical locations are also known. We also classify the individuals into groups according to a community detection algorithm. We study the variation of geographical span for social groups of varying sizes, and explore the relationship between topological positions and geographic positions of their members. We find that small social groups are geographically very tight, but become much more clumped when the group size exceeds about 30 members. Also, we find no correlation between the topological positions and geographic positions of individuals within network communities. These results suggest that spreading processes face distinct structural and spatial constraints.

Collective response of human populations to large-scale emergencies

J. P. Bagrow, D. Wang, A.-L. Barabási

PLoS One 6:3, 1-8 (2011)

Despite recent advances in uncovering the quantitative features of stationary human activity patterns, many applications, from pandemic prediction to emergency response, require an understanding of how these patterns change when the population encounters unfamiliar conditions. To explore societal response to external perturbations we identified real-time changes in communication and mobility patterns in the vicinity of eight emergencies, such as bomb attacks and earthquakes, comparing these with eight non-emergencies, like concerts and sporting events. We find that communication spikes accompanying emergencies are both spatially and temporally localized, but information about emergencies spreads globally, resulting in communication avalanches that engage in a significant manner the social network of eyewitnesses. These results offer a quantitative view of behavioral changes in human activity under extreme conditions, with potential long-term impact on emergency detection and response.

Interactome Networks and Human Disease

M. Vidal, M. E. Cusick, A.-L. Barabási

Cell 144, 986-995 (2011)

Complex biological systems and cellular networks may underlie most genotype to phenotype relationships. Here, we review basic concepts in network biology, discussing different types of interactome networks and the insights that can come from analyzing them. We elaborate on why interactome networks are important to consider in biology, how they can be mapped and integrated with each other, what global properties are starting to emerge from interactome network models, and how these properties may relate to human disease.

Small but slow world: How network topology and burstiness slow down spreading

M. Karsai, M. Kivelä, R. K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki

Physical Review E 83, 1-4 (2011)

While communication networks show the small-world property of short paths, the spreading dynamics in them turns out slow. Here, the time evolution of information propagation is followed through communication networks by using empirical data on contact sequences and the susceptible-infected model. Introducing null models where event sequences are appropriately shuffled, we are able to distinguish between the contributions of different impeding effects. The slowing down of spreading is found to be caused mainly by weight-topology correlations and the bursty activity patterns of individuals.
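The approach described in this abstract can be sketched as a susceptible-infected (SI) process run over a time-ordered contact sequence, compared against a shuffled version of the same sequence. The event format `(time, u, v)`, the deterministic transmission at every contact, and the particular timestamp shuffle below are assumptions made for illustration; the paper's own null models may differ.

```python
import random


def si_spread(events, seed_node, t0=0.0):
    """Deterministic SI dynamics: infection passes along every contact, in time order.

    events: iterable of (time, u, v) undirected contacts.
    Returns {node: infection_time}; the seed is infected at time t0.
    """
    infected_at = {seed_node: t0}
    for t, u, v in sorted(events, key=lambda e: e[0]):
        if t < t0:
            continue  # contacts before the seed is infected cannot transmit
        if u in infected_at and v not in infected_at:
            infected_at[v] = t
        elif v in infected_at and u not in infected_at:
            infected_at[u] = t
    return infected_at


def shuffle_times(events, rng):
    """Null model: permute timestamps across events, keeping the contact pairs fixed."""
    events = list(events)
    times = [t for t, _, _ in events]
    rng.shuffle(times)
    return [(t, u, v) for t, (_, u, v) in zip(times, events)]


# Hypothetical contact sequence along a chain a - b - c - d.
contacts = [(1, "a", "b"), (2, "b", "c"), (3, "c", "d")]
reached = si_spread(contacts, "a")
shuffled = shuffle_times(contacts, random.Random(0))
```

Comparing how quickly the infected set grows on the original and on the shuffled events gives a rough sense of how much the timing of contacts, as opposed to the topology alone, slows spreading.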

Comparison of an expanded ataxia interactome with patient medical records reveals a relationship between macular degeneration and ataxia

J. J. Kahle, N. Gulbahce, C. A. Shaw, J. Lim, D. E. Hill, A.-L. Barabási, H. Y. Zoghbi

Human Molecular Genetics 20, 510-527 (2011)

Spinocerebellar ataxias 6 and 7 (SCA6 and SCA7) are neurodegenerative disorders caused by expansion of CAG repeats encoding polyglutamine (polyQ) tracts in CACNA1A, the alpha1A subunit of the P/Q-type calcium channel, and ataxin-7 (ATXN7), a component of a chromatin-remodeling complex, respectively. We hypothesized that finding new protein partners for ATXN7 and CACNA1A would provide insight into the biology of their respective diseases and their relationship to other ataxia-causing proteins. We identified 118 protein interactions for CACNA1A and ATXN7 linking them to other ataxia-causing proteins and the ataxia network. To begin to understand the biological relevance of these protein interactions within the ataxia network, we used OMIM to identify diseases associated with the expanded ataxia network. We then used Medicare patient records to determine if any of these diseases co-occur with hereditary ataxia. We found that patients with ataxia are at 3.03-fold greater risk of these diseases than Medicare patients overall. One of the diseases comorbid with ataxia is macular degeneration (MD). The ataxia network is significantly (P = 7.37 × 10⁻⁵) enriched for proteins that interact with known MD-causing proteins, forming a MD subnetwork. We found that at least two of the proteins in the MD subnetwork have altered expression in the retina of Ataxin-7(266Q/+) mice suggesting an in vivo functional relationship with ATXN7. Together these data reveal novel protein interactions and suggest potential pathways that can contribute to the pathophysiology of ataxia, MD, and diseases comorbid with ataxia.

Information Spreading in Context

D. Wang, Z. Wen, H. Tong, C.-Y. Lin, C. Song, A.-L. Barabási

Proceedings of the 20th International World Wide Web Conference, 1-10 (2011)

Information spreading processes are central to human interactions. Despite recent studies in online domains, little is known about factors that could affect the dissemination of a single piece of information. In this paper, we address this challenge by combining two related but distinct datasets, collected from a large scale privacy-preserving distributed social sensor system. We find that the social and organizational context significantly impacts to whom and how fast people forward information. Yet the structures within spreading processes can be well captured by a simple stochastic branching model, indicating surprising independence of context. Our results build the foundation of future predictive models of information flow and provide significant insights towards design of communication platforms.

Network medicine: a network-based approach to human disease

A.-L. Barabási, N. Gulbahce, J. Loscalzo

Nature Reviews Genetics 12, 56-68 (2011)

Given the functional interdependencies between the molecular components in a human cell, a disease is rarely a consequence of an abnormality in a single gene, but reflects the perturbations of the complex intracellular and intercellular network that links tissue and organ systems. The emerging tools of network medicine offer a platform to explore systematically not only the molecular complexity of a particular disease, leading to the identification of disease modules and pathways, but also the molecular relationships among apparently distinct (patho)phenotypes. Advances in this direction are essential for identifying new disease genes, for uncovering the biological significance of disease-associated mutations identified by genome-wide association studies and full-genome sequencing, and for identifying drug targets and biomarkers for complex diseases.

Modelling the scaling properties of human mobility

C. Song, Z. Qu, N. Blumm, A.-L. Barabási

Nature Physics 7, 713- (2010)

A range of applications, from predicting the spread of human and electronic viruses to city planning and resource management in mobile communications, depend on our ability to foresee the whereabouts and mobility of individuals, raising a fundamental question: To what degree is human behavior predictable? Here we explore the limits of predictability in human dynamics by studying the mobility patterns of anonymized mobile phone users. By measuring the entropy of each individual’s trajectory, we find a 93% potential predictability in user mobility across the whole user base. Despite the significant differences in the travel patterns, we find a remarkable lack of variability in predictability, which is largely independent of the distance users cover on a regular basis.
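To show how an entropy value can be turned into a predictability bound of the kind quoted in this abstract, the sketch below solves Fano's inequality numerically. It assumes the relation S = H(p) + (1 - p) log2(N - 1), where S is the trajectory entropy in bits, N the number of distinct locations visited, and H the binary entropy function. The entropy estimate itself is taken as an input, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def max_predictability(entropy_bits, n_locations, tol=1e-12):
    """Solve S = H(p) + (1 - p) * log2(N - 1) for p in [1/N, 1] by bisection."""
    if n_locations < 2:
        return 1.0  # a single location is trivially predictable

    def fano(p):
        return binary_entropy(p) + (1.0 - p) * math.log2(n_locations - 1)

    lo, hi = 1.0 / n_locations, 1.0
    # fano decreases from log2(N) at p = 1/N down to 0 at p = 1.
    if entropy_bits >= fano(lo):
        return lo
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fano(mid) > entropy_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

A user with zero entropy is perfectly predictable, while a user whose entropy equals log2(N) can be predicted no better than by a uniform guess over the N locations.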

Blueprint for antimicrobial hit discovery targeting metabolic networks

Y. Shen, L. Liu, G. Estiu, B. Isin, Y.-Y. Ahn, D.-S. Lee, A.-L. Barabási, V. Kapatral, O. Wiest, Z. N. Oltvai

Proceedings of the National Academy of Sciences of the United States of America 10.1073, 1-6 (2010)

Advances in genome analysis, network biology, and computational chemistry have the potential to revolutionize drug discovery by combining system-level identification of drug targets with the atomistic modeling of small molecules capable of modulating their activity. To demonstrate the effectiveness of such a discovery pipeline, we deduced common antibiotic targets in Escherichia coli and Staphylococcus aureus by identifying shared tissue-specific or uniformly essential metabolic reactions in their metabolic networks. We then predicted through virtual screening dozens of potential inhibitors for several enzymes of these reactions and showed experimentally that a subset of these inhibited both enzyme activities in vitro and bacterial cell viability. This blueprint is applicable for any sequenced organism with high-quality metabolic reconstruction and suggests a general strategy for strain-specific antiinfective therapy.

Cancer metastasis networks and the prediction of progression patterns

L. L. Chen, N. Blumm, N. A. Christakis, A.-L. Barabási, T. S. Deisboeck

British Journal of Cancer 101, 749-758 (2009)
