In modern economics, growth is endogenously related to technological change through the investment decisions of profit-maximizing economic agents (Romer 1990): the more human capital is devoted to successful research in science and engineering, the greater the economic growth that ensues. The effective management of research investments is therefore crucial to economic growth and, ultimately, to improving quality of life. Because the benefits of research come largely in the future, it is difficult to use standard return-on-investment metrics, such as cash flow analysis, to assess the performance of research investment portfolios. Rather, decision-making in science policy requires a systematic understanding of the factors that promote or hinder the scientific and societal impacts of research in science and engineering. Scientometrics has made pivotal contributions to this endeavor by providing bibliometric measures that quantify the productivity and impact of published work and scientific collaboration. However, a Science of Science Policy ultimately requires analytic models that combine bibliometrics with additional measurable project outputs to provide comprehensive insights into the overall outcomes of research in science and engineering. The aim of this paper is to develop a Predictive Scientometric Analytics framework that implements this Science of Science Policy vision and helps policymakers manage investments in science by analyzing and forecasting the scientific and societal impacts of research in science and engineering.
We apply knowledge discovery techniques to Laboratory Directed Research and Development project data from a U.S. national laboratory to build models of research impact that combine information about scientific publications, the creation of intellectual property, receipt of awards, growth of the scientific workforce, establishment of collaborations, and generation of new funding. Our objective is the creation of data-driven computational models that can be used to monitor, predict and change the course of ongoing research projects.
We use machine learning techniques to learn classification models of project attainment from records of project funding, aims and achievements. We show how records of funds received, aims proposed and outputs achieved can serve as training data for analytic and predictive computational models of research impact. The evaluation of the models developed for each of the six dimensions of focus – publications, intellectual property, awards, employment, collaboration and sales – and of their combination into an overall measure of project impact demonstrates the viability and effectiveness of the methodology.
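As a rough illustration of this idea, the sketch below fits one classifier per impact dimension and averages the predicted probabilities into a single score. It is a minimal sketch under stated assumptions, not the method used in the study: logistic regression stands in for the unspecified learning technique, the project records are invented toy values, only three dimensions are shown, and all names (`PROJECTS`, `OUTCOMES`, `train_logistic`, `overall_impact`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit a logistic-regression classifier with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this record under the current model.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(model, x):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Invented toy records (NOT real project data):
# [funding in millions of dollars, number of external collaborators]
PROJECTS = [[0.2, 0], [0.4, 1], [0.5, 0], [0.9, 2], [1.2, 3], [1.5, 2]]

# Invented binary outcomes per impact dimension (1 = output achieved).
OUTCOMES = {
    "publications": [0, 0, 0, 1, 1, 1],
    "intellectual_property": [0, 0, 0, 0, 1, 1],
    "new_funding": [0, 1, 0, 1, 1, 1],
}

# One classifier per dimension.
MODELS = {dim: train_logistic(PROJECTS, y) for dim, y in OUTCOMES.items()}


def overall_impact(features):
    """Combine per-dimension probabilities; a plain mean is an assumption."""
    probs = [predict_proba(model, features) for model in MODELS.values()]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    for candidate in ([0.2, 0], [1.5, 3]):
        print(candidate, round(overall_impact(candidate), 3))
```

An unweighted mean is only one way to combine the dimensions; a weighted combination reflecting policy priorities would slot into `overall_impact` without changing the per-dimension models.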
The resulting classification models provide an understanding of which factors contribute to project impact and of the extent of each contribution. By artificially manipulating the contributing factors – e.g. adding funding, fostering outside collaboration, increasing the rate of publications or the production of intellectual property – it is possible to forecast how a project might develop under diverse management conditions.
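The what-if analysis described here can be sketched as follows. This is an illustrative sketch only: the coefficients stand in for an already fitted model and are made up, as are the feature names, the baseline project and the helper names (`impact_probability`, `what_if`). A logistic form is assumed purely for convenience.

```python
import math

# Hypothetical coefficients standing in for a fitted classifier.
# The values are illustrative assumptions, not results from the study.
COEFFICIENTS = {
    "funding_musd": 1.2,
    "external_collaborators": 0.8,
    "papers_per_year": 0.5,
}
INTERCEPT = -3.0


def impact_probability(project):
    """Predicted probability that a project reaches high impact."""
    z = INTERCEPT + sum(COEFFICIENTS[k] * project[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


def what_if(project, **changes):
    """Forecast impact after adding the given deltas to selected factors.

    The input project is copied, so the baseline record is left unchanged.
    """
    scenario = dict(project)
    for name, delta in changes.items():
        scenario[name] += delta
    return impact_probability(scenario)


# Invented baseline project used only to exercise the functions.
BASELINE = {"funding_musd": 0.5, "external_collaborators": 1, "papers_per_year": 2}

SCENARIOS = {
    "baseline": what_if(BASELINE),
    "add_funding": what_if(BASELINE, funding_musd=0.5),
    "add_collaborator": what_if(BASELINE, external_collaborators=1),
}

if __name__ == "__main__":
    for name, probability in SCENARIOS.items():
        print(f"{name}: {probability:.3f}")
```

Comparing scenario probabilities against the baseline gives a simple reading of how much each management lever moves the forecast under the assumed model.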
We conclude by outlining how the models of project impact developed through the methodology presented can be utilized to power a decision-support application that helps policymakers manage investments in science and innovation.