- Collaborative Content Creation using Large Databases
- Game-driven Mass-scale Content Creation
- Tools for Rapid Creation and Editing of Animation
Creating visual content is difficult. Consider any visual medium: photographs, video, 3D models, diagrams and illustrations, realistic 3D renderings, animation, slide presentations, webpage design, games, etc. The software tools for creating such visual media usually include sophisticated interfaces that provide precise and flexible control over the content. Yet, they are dauntingly difficult to learn to use effectively. Although cameras now allow anyone to easily capture photos and video, tools for manipulating such media and creating other forms of visual content remain accessible only to experts. We believe that two recent trends offer the promise of enabling a significantly wider set of everyday users (of all skill levels) to produce visual content:
Collaboration via the Internet:
The World Wide Web allows people to work together and leverage complementary skills to collaboratively create content. Content-sharing sites like Flickr and YouTube already allow people to add photos and video to their visual projects. Crowdsourcing frameworks (e.g., Amazon Mechanical Turk and CrowdFlower) allow requesters to post content creation tasks that workers can fulfill for small payments. Multi-player games provide other types of incentives to create content. We believe that such collaboration will fundamentally change the way people create visual content.
New Sensing Hardware for Input:
Mobile devices have recently made video cameras, microphones, and multi-touch screens commonplace. Gaming systems are poised to make 3D cameras and accelerometers similarly cheap and ubiquitous. Such sensors enable simple, direct, gestural interaction and can provide greater recognition of context. We believe that developing content creation interfaces that take advantage of such hardware will make these tools accessible to a much wider set of users.
Communication is fundamental to human existence. We can communicate at far greater data rates using visual content than using language; as the saying goes, a picture is worth a thousand words. The current difficulty of creating visual content severely limits our ability to express ourselves visually. Simplifying content creation will let people effortlessly create sophisticated visual content such as animations and videos, and use it to express information and ideas at a much higher comprehension rate than the spoken or written word allows, especially across cultural and language divides. Great communicators use language to effectively communicate visions. If we could all create such visions on our computers, we would all become great communicators.
- Systems for Embedded/Mobile Computational Photography
- Programmable pipelines for mainstream graphics: irregular and heterogeneous parallelism
- Internet-scale Visual Computing with Meru
- Large-Scale Distributed Image Analysis and Visualization
To create the future architectures needed to support advances such as those outlined in the above three themes, the researchers involved in this theme will explore next generation architectures and tools that address the following four crucial technology trends.
Personal computing is increasingly moving away from traditional desktop computers toward mobile devices, ranging from laptops to tablets to pocket-sized computers, phones, and other battery-powered devices. The result is a need to design systems that focus on mobility with an emphasis on power-aware design, miniaturization, and efficient computing given a minimal cost, power, and volume budget.
Just as computing is moving away from traditional PCs onto mobile devices, it is also moving in the opposite direction, into the cloud, which can deliver better reliability, cost, and scalability than the desktop.
GPU design has historically incorporated both fixed-function and programmable parts. While a recent trend toward more flexibility has increased the focus on programmable components, the superior power efficiency of fixed-function components merits their continued study. The likely result is heterogeneous graphics systems with heterogeneity at many levels: fine-grained (such as integer vs. floating point ALUs); medium-grained (such as rasterization or texture filtering units); and coarse-grained (CPU cores vs. GPU cores). Determining the right mix of units, and programming the resulting heterogeneous systems, is one of the grand systems design challenges in computing.
Finally, at all levels of computing, from mobile to desktop to the cloud, we see a growing gap between the capabilities of the hardware (what the system could do) and the delivered performance of the software (what the system actually does). As the hardware becomes more complex, more parallel, and more heterogeneous, we see a real and growing need to solve the programmability problem by building software that allows programmers and users to make the most of the hardware.
These trends will affect designs, implementations, and software support for future graphics systems of all sizes, from small, inexpensive mobile systems to traditional single-node GPUs to visual computing in the cloud.
PERCEIVING PEOPLE AND PLACES
- Home Modeling and Remodeling
- Semantic Modeling of Urban Environments
- Parsing the Appearance of People in Images
- Recognizing Human Actions in Videos
The proliferation of digital cameras, coupled with explosive progress in computer vision, has led to major breakthroughs in sensing technologies. We now use these technologies in our everyday lives (in cameras, maps, and search), with many more uses on the way (cars, personal robotics, smart homes, etc.). These advances are due in large part to the recent development of extremely accurate and robust low-level computer vision algorithms for feature detection, matching, and 3D measurement. The next wave of breakthroughs will be defined by the ability to infer high-level functional and semantic information about people and environments from images and video. Beyond telling you if a person is present, next-generation systems will reliably perceive who it is and what that person is doing, down to the level of actions and activities. Such capabilities will be transformational across a wide spectrum of applications; using your body to replace the mouse (as in Microsoft's upcoming Kinect/Project Natal game console) is just the tip of the iceberg. Similarly, next-generation 3D vision systems will go beyond raw depth measurement to also perceive the functional and semantic content of the scene. While current 3D modeling methods represent the scene as an unorganized mass of points or triangles, next-generation systems will recognize what is in the scene: doors, chairs, stairs, sidewalks, windows, tables, and other components. Beyond scene visualization, these new capabilities will enable applications such as home remodeling pre-visualization and building searchable, functional 3D city models from online imagery and LIDAR data that you can interact with in an online virtual world.
SCALABLE REAL-TIME SIMULATION
- Scalable Visual Rendering
- Integrated Physics Software
- Physics-Based Sound Rendering
- Simulated Virtual Characters
The simulation of physical phenomena is central to visual computing: from light transport and appearance, to the dynamics of fluids and solids, the movement of virtual characters, and increasingly the sounds these systems make. Such multi-sensory simulations are notoriously expensive to compute, making it difficult to realize real-time simulations and to develop simulation applications for mobile computing. This theme will address physics-based simulation and multi-sensory rendering of light, motion, and sound in an integrated and connected manner. Research will be focused on the unique challenges posed by simulated virtual characters. In addition, this theme will explore the computational challenges of scalable simulation as it relates to multi-physics and multi-sensory software integration, parallel and distributed computing, interactive and hard real-time computation, model complexity (e.g., planetary-scale simulation), the spectrum of computing platforms (from mobile to the cloud), and ever-present memory and bandwidth concerns.