Today, significant gaps exist between the data we wish to understand and the computing capability necessary to derive that understanding. These manifest themselves technically as problems in algorithms, architecture, and programming models. Big data analytics and machine learning are critical to competitiveness in industry, for national security, and for scientific exploration. We see four classes of data understanding gaps:
- Scalability Gaps, where algorithms exist today but cannot withstand growth in data;
- Productivity Gaps, in which algorithms and data may exist, but deploying those algorithms and producing meaningful results is a challenge;
- Fundamental Science and Mathematics Gaps, which are long-standing challenge problems with no known solution or technique, but which pose significant risks if unaddressed; and
- Direction and Leadership Gaps, which arise from a lack of coordination among numerous potential solutions, all of which exist in isolation.
At the national level, assimilating information across disparate domains (e.g., intelligence, economics, science) can provide improved capabilities for decision makers. In industry, analytics-driven enterprises have moved into the mainstream, and traditional enterprises are increasingly moving towards an analytics-driven approach for their core business functions. Cybersecurity problems impact governments and corporations equally, with General Alexander estimating that they cost the economy $1T annually. Given this environment, two trends for national competitiveness are clear:
- The government’s interest in more capable analytics aligns with industry’s interest in the same technology. Data intensive computing may have a significantly larger market than traditional HPC. Investment in advanced analytics capabilities could potentially be more broadly leveraged than any prior government investments in computing.
- We have entered an era of exponential growth in data, which exceeds the growth in compute capability. As a result, both “large-scale” computing and rethinking the overall compute infrastructure in the context of data analytics are required.
Successfully addressing these two trends is critical to ensuring ongoing US competitiveness. The research firm Gartner predicted in December 2011 that 85% of Fortune 500 companies will be unprepared to leverage big data for a competitive advantage by 2015[1]. Similarly, the President’s Council of Advisors on Science and Technology concluded that large-scale data management and analytics is a critically important strategic investment area[2], which is reflected in current federal investment plans. Today we tend to think of “cloud computing” and “high performance computing” as two disparate things; however, large-scale, ad-hoc, complex analytics problems are not easily addressed in the cloud, and HPC tends not to be able to address non-compute-oriented problems. What is needed is a more advanced common solution that optimally exploits the strengths of these different architectural approaches. We propose to take initial steps to:
- Define the initial set of metrics appropriate to a “big data” system
- Describe challenge problems that stress the boundaries of today’s systems
- Look for algorithmic techniques and general approximations that can help convert the worst of these problems (the ones that are notionally worse than linear) into solvable approximations
- Look for the "tall poles" in the performance tent when the best of the algorithms are run on modern systems, so that alternative architectures can be proposed and evaluated
This workshop will bring together leaders in analytics from big data companies, consumers, the government, and academia with the theme of identifying the gaps between the kinds of analytics we would like to perform and today’s compute capabilities. The goal is to better understand which computing problems would best motivate an improvement in technology and infrastructure and how government and industry requirements may better align in the future.
Organizing Committee:
Candy Culhane (DoD)
John Feo (PNNL)
Moe Khaleel (PNNL)
Rob Leland (SNL)
Dave Mountain (DoD)
Richard Murphy (SNL)
Steve Pritchard (DoD)
[1] http://www.gartner.com/it/page.jsp?id=1862714
[2] Designing a Digital Future: Federally Funded Research in Networking and Information Technology, December 2010.
Conference Agenda
agenda.pdf (93 kb)