A new whitepaper that Microsoft researchers are set to present at a conference next month sheds more light on Microsoft's back-end cloud infrastructure.

The paper, entitled “SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets,” details a new declarative scripting language that is optimized for storing and analyzing massive data sets (like search logs and click streams) that are key to cloud-scale service architectures. SCOPE, or Structured Computations Optimized for Parallel Execution, is the name of the language.

According to the paper, which Microsoft is on tap to present at the VLDB 2008 conference in late August, SCOPE doesn't require explicit parallelism,
but it will be “amenable to efficient parallel execution” across large clusters. SCOPE is like SQL, but with C# extensions, the paper says.

I found the new whitepaper via a blog link from Greg Linden, an employee of Microsoft's Live Labs. Linden blogged:

“Scope is similar to Yahoo's Pig, which is a higher level language on top of Hadoop, or Google's Sawzall,
which is a higher level language on top of MapReduce. But,
where Pig focuses on and advocates a more imperative programming style, Scope looks a lot more like SQL.”

Reading through the paper, I noticed an explanation of how SCOPE fits in with Cosmos, Microsoft's back-end storage layer that currently powers Live Search and other Microsoft services. The SCOPE whitepaper sheds more light on what Cosmos is and how it works. From the paper:

“Microsoft has developed a distributed computing platform,
called Cosmos,
for storing and analyzing massive data sets. Cosmos is designed to run on large clusters consisting of thousands of commodity servers. Disk storage is distributed with each server having one or more direct-attached disks.”

(A loosely-coupled aside: I wonder if Pat Helland's decision to move to the SQL team at Microsoft has any connection to all of this. Helland's expertise is in big-picture strategy around transactional and parallel processing, as well as service-oriented architectures.)

Increasingly, all of Microsoft's future strategies and products finally seem to be converging. More teams are thinking about parallel/distributed/multicore computing, with the experimental Windows successor code-named Midori being just the most recent of many examples. More Microsoft products are seemingly being designed with modeling in mind from the get-go.

Maybe Chief Software Architect Ray Ozzie's campaign to “drive alignment” across the various Microsoft product groups is finally taking root…. Or maybe it's simply that cloud computing, to be truly scalable, must be built to work across increasingly large networks of distributed systems. Or maybe it's a little of both….
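
For readers wondering what the declarative-versus-imperative distinction in Linden's comparison looks like in practice, here is a rough Python sketch. It is not SCOPE, Pig, or Sawzall syntax, and nothing in it comes from the whitepaper: the tiny `SEARCH_LOG` list, the two function names, and the use of an in-memory SQLite table are all assumptions made purely for illustration. Both functions count how often each search query appears and keep the frequent ones. The first states the desired result in SQL and leaves the execution plan to the engine; the second spells out each step by hand.

```python
import sqlite3
from collections import Counter

# Hypothetical miniature search log (illustrative data, not from the paper).
SEARCH_LOG = ["weather", "news", "weather", "maps", "news", "weather"]


def frequent_queries_declarative(log, min_count):
    """Declarative style: describe the result in SQL; the engine picks the plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE searches (query TEXT)")
    conn.executemany("INSERT INTO searches VALUES (?)", [(q,) for q in log])
    rows = conn.execute(
        "SELECT query, COUNT(*) AS n FROM searches "
        "GROUP BY query HAVING COUNT(*) >= ? "
        "ORDER BY n DESC, query",
        (min_count,),
    ).fetchall()
    conn.close()
    return rows


def frequent_queries_imperative(log, min_count):
    """Imperative style: the program dictates every step of the computation."""
    counts = Counter()
    for q in log:                      # step 1: tally each query
        counts[q] += 1
    kept = [(q, n) for q, n in counts.items() if n >= min_count]  # step 2: filter
    kept.sort(key=lambda item: (-item[1], item[0]))               # step 3: sort
    return kept


if __name__ == "__main__":
    print(frequent_queries_declarative(SEARCH_LOG, 2))
    print(frequent_queries_imperative(SEARCH_LOG, 2))
```

The appeal of the declarative form, at least as the paper's pitch goes, is that a query which says nothing about how it should run leaves the system free to spread the work across a large cluster without the author writing any explicit parallel code.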