Does microbiology of the built environment have any Big Data issues? If so …. $$$$

Just was sent this by our grants office: Funding – Critical Techniques and Technologies for Advancing Foundations and Applications of Big Data Science & Engineering – US National Science Foundation (NSF).

Seems like this may be of interest to folks working on microbiology of the built environment as there are some serious Big Data challenges here.

Summary text is below:

The BIGDATA program seeks novel approaches in computer science, statistics, computational science, and mathematics, along with innovative applications in domain science, including social and behavioral sciences, geosciences, education, biology, the physical sciences, and engineering that lead towards the further development of the interdisciplinary field of data science.  The solicitation invites two types of proposals: “Foundations” (F): those developing or studying fundamental theories, techniques, methodologies, technologies of broad applicability to Big Data problems; and “Innovative Applications” (IA): those developing techniques, methodologies and technologies of key importance to a Big Data problem directly impacting at least one specific application.  Therefore, projects in this category must be collaborative, involving researchers from domain disciplines and one or more methodological disciplines, e.g., computer science, statistics, mathematics, simulation and modeling, etc. While Innovative Applications (IA) proposals may address critical big data challenges within a specific domain, a high level of innovation is expected in all proposals and proposals should, in general, strive to provide solutions with potential for a broader impact on data science and its applications. IA proposals may focus on novel theoretical analysis and/or on experimental evaluation of techniques and methodologies within a specific domain. Proposals in all areas of sciences and engineering covered by participating directorates at NSF are welcome.

While notions of volume, velocity, and variety are commonly ascribed to big data problems, other key issues include data quality and provenance. Data-driven solutions must carefully ascribe quality and provenance to results in a manner that is helpful to the users of the results. For example, in some cases, such as in education research, data quality may aggregate to test or measurement instrument quality, where a composite of variables may be used to describe one or more constructs.

In addition to approaches such as search, query processing, and analysis, visualization techniques will also become critical across many stages of big data use–to obtain an initial assessment of data as well as through subsequent stages of scientific discovery. Research on visualization techniques and models will be necessary for serving not only the experts, who are collecting the data, but also those who are users of the data, including “cross-over” scientists who may be working with big data and analytics for the first time, and those using the data for teaching at the undergraduate and graduate levels. The BIGDATA program seeks novel approaches related to all of these areas of study.

Before preparing a proposal in response to this BIGDATA solicitation, applicants are strongly urged to consult the list of related solicitations and consult the respective NSF program officers listed in them should those solicitations be more appropriate.  In particular, applicants interested in deployable cyberinfrastructure pilots that would support a broader research community should see the Campus Cyberinfrastructure – Data, Networking, and Innovation Program (CC*DNI) solicitation. Applicants should also consider the Computational and Data Enabled Science and Engineering (CDS&E, PD 12-8084) and Exploiting Parallelism and Scalability (XPS, NSF 15-511) solicitations for potential fit.


One thought on “Does microbiology of the built environment have any Big Data issues? If so …. $$$$”

  1. Definitely. I mean, we have tons of databases, genome resources, and all kinds of omics data available. But is there a simple way of interconnecting this information, asking scientific questions about it, and analysing it? The big data stores out there nowadays only do static storage: once uploaded, data is there for life, but it is also static for life. How many annotated genomes are out there that are more than a year old and have never been looked at again? Too many, in my opinion, and if you want to use them, how trustworthy are they?


Jonathan Eisen

I am an evolutionary biologist and a Professor at U. C. Davis. My lab is in the UC Davis Genome Center and I hold appointments in the Department of Medical Microbiology and Immunology in the School of Medicine and the Department of Evolution and Ecology in the College of Biological Sciences. My research focuses on the origin of novelty (how new processes and functions originate). To study this I focus on sequencing and analyzing genomes of organisms, especially microbes, and on using phylogenomic analysis (see my lab site, which has more information on lab activities).  In addition to research, I am heavily involved in the Open Access publishing and Open Science movements.