US20130321413A1 - Video generation using convict hulls - Google Patents

Video generation using convict hulls

Info

Publication number
US20130321413A1
Authority
US
United States
Prior art keywords
planes
series
scene
contours
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/790,158
Inventor
Patrick Sweeney
Don Gillett
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp filed Critical Microsoft Corp
Priority to US13/790,158
Assigned to MICROSOFT CORPORATION reassignment MICROSOFT CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: GILLETT, DON, SWEENEY, PATRICK
Publication of US20130321413A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC reassignment MICROSOFT TECHNOLOGY LICENSING, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MICROSOFT CORPORATION

Classifications

    • G06T 15/04: Texture mapping (3D [Three Dimensional] image rendering)
    • G06T 15/08: Volume rendering
    • G06T 15/205: Image-based rendering (geometric effects; perspective computation)
    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 2210/56: Particle system, point based geometry or rendering
    • H04N 13/117: Transformation of image signals corresponding to virtual viewpoints, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • H04N 13/194: Transmission of image signals
    • H04N 13/239: Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H04N 13/243: Image signal generators using stereoscopic image cameras using three or more 2D image sensors
    • H04N 13/246: Calibration of cameras
    • H04N 13/257: Colour aspects (image signal generators)
    • H04N 7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display (systems for two-way working between two video terminals)
    • H04N 7/15: Conference systems
    • H04N 7/157: Conference systems defining a virtual conference space and using avatars or agents
    • H04R 2227/005: Audio distribution systems for home, i.e. multi-room use
    • H04S 2400/15: Aspects of sound capture and related signal processing for recording or reproduction

Definitions

  • a given video generally includes one or more scenes, where each scene in the video can be either relatively static (e.g., the objects in the scene do not substantially change or move over time) or dynamic (e.g., the objects in the scene substantially change and/or move over time).
  • polygonal modeling is commonly used to represent three-dimensional objects in a scene by approximating the surface of each object using polygons.
  • a polygonal model of a given scene includes a collection of vertices. Two neighboring vertices that are connected by a straight line form an edge in the polygonal model. Three neighboring and non-co-linear vertices that are interconnected by three edges form a triangle in the polygonal model.
  • a polygonal/mesh model of a scene includes a collection of vertices, edges and polygonal (i.e., polygon-based) faces that represents/approximates the shape of each object in the scene.
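  • For illustration only, a minimal sketch of such a polygonal/mesh model (using a NumPy-style layout that is an assumption made here, not the patent's own data format) might look like this:

```python
import numpy as np

class MeshModel:
    """Minimal triangle-mesh container: a collection of vertices plus the
    polygonal (here triangular) faces formed by them; edges are implied by
    each pair of neighboring vertex indices within a face."""
    def __init__(self, vertices, faces):
        self.vertices = np.asarray(vertices, dtype=float)  # (V, 3) vertex positions
        self.faces = np.asarray(faces, dtype=int)          # (F, 3) vertex indices per triangle

# The simplest possible mesh: a single triangle.
tri = MeshModel(vertices=[[0, 0, 0], [1, 0, 0], [0, 1, 0]], faces=[[0, 1, 2]])
```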
  • Video generation technique embodiments described herein are generally applicable to generating a video of a scene and presenting it to a user.
  • one or more streams of sensor data that represent the scene are received.
  • Scene proxies are then generated from the streams of sensor data.
  • This scene proxies generation includes the following actions.
  • a stream of mesh models of the scene is generated from the streams of sensor data.
  • the following actions take place.
  • the mesh model is sliced using a series of planes that are parallel to each other, where each of the planes in the series defines one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane.
  • a texture map is then generated for the mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes.
  • the scene proxies are received.
  • the scene proxies include a stream of mathematical equations describing contours that are defined by a series of planes that are parallel to each other, and a stream of texture maps defining texture data corresponding to each of the contours that is defined by the series of planes.
  • Images of the scene are then rendered from the scene proxies and displayed. This image rendering includes the following actions.
  • the series of planes is constructed using data specifying the spatial orientation and geometry of the series of planes.
  • the contours that are defined by the series of planes are then constructed using the stream of mathematical equations.
  • a series of point locations is then constructed along each of the contours that is defined by the series of planes, where this construction is performed in a prescribed order across each of the planes in the series of planes, and this construction is also performed starting from a prescribed zero position on each of these contours.
  • the point locations that are defined by the series of planes are then tessellated, where this tessellation generates a stream of polygonal models, and each polygonal model includes a collection of polygonal faces that are formed by neighboring point locations on corresponding contours on neighboring planes in the series of planes.
  • the stream of texture maps is then sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models. This identified texture data is then used to add texture to each of the polygonal faces in the stream of polygonal models.
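  • As a rough sketch of the kind of per-frame data such a scene proxy carries (the class and field names below are assumptions made for illustration, not terminology from this document), the contour descriptions, the contour correspondence data, and the texture map can be bundled as follows, with the video's scene proxies then being a time-ordered stream of such objects:

```python
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class FrameProxy:
    """Scene proxy for one mesh model (one moment in time) of the scene."""
    # For each slicing plane, one description per contour (e.g., the control
    # points of a polygon or NURBS approximation of that contour).
    contours_per_plane: List[List[np.ndarray]]
    # For each pair of neighboring planes, which contour indices correspond.
    correspondences: List[List[Tuple[int, int]]]
    # One scanline of texels per plane: (num_planes, texels_per_scanline, channels).
    texture_map: np.ndarray
```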
  • FIG. 1 is a diagram illustrating an exemplary embodiment, in simplified form, of a video processing pipeline for implementing the video generation technique embodiments described herein.
  • FIG. 2 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating a video of a scene.
  • FIG. 3 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating scene proxies that geometrically describe the scene as a function of time.
  • FIGS. 4A-4C are diagrams illustrating an exemplary embodiment, in simplified form, of a series of three planes that are parallel to each other and are used to slice a mesh model of a human.
  • FIG. 5 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating a texture map for a given mesh model in a stream of mesh models of the scene.
  • FIG. 6 is a diagram illustrating an exemplary embodiment, in simplified form, of a contour and three-dimensional point locations that are identified along the contour.
  • FIG. 7 is a diagram illustrating an exemplary embodiment, in simplified form, of a contour point map.
  • FIG. 8 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for storing the scene proxies.
  • FIG. 9 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for distributing the scene proxies to an end user who either is, or will be, viewing the video.
  • FIG. 10 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for presenting the video of the scene to an end user.
  • FIG. 11 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for rendering images of the scene from the scene proxies.
  • FIG. 12 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for assigning texels in a given scanline of the texture map to contours that are defined by a given plane in a series of planes that are used to slice the mesh model.
  • FIG. 13 is a diagram illustrating a simplified example of a general-purpose computer system on which various embodiments and elements of the video generation technique, as described herein, may be implemented.
  • FIG. 14 is a diagram illustrating a simplified example of the assignment of the texels in two neighboring scanlines of an exemplary texture map to exemplary contours that are defined by two neighboring planes that correspond to the two neighboring scanlines.
  • FIG. 15 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for sampling a stream of texture maps to identify texture data that corresponds to polygonal faces in a stream of polygonal models that is generated from a stream of mathematical equations describing contours that are defined by the series of planes.
  • the Visible Human Project was conceived in the late 1980s and run by the U.S. National Library of Medicine.
  • the goal of the Project was to create a detailed human anatomy data set using cross-sectional photographs of the human body in order to facilitate anatomy visualization applications.
  • a convicted murderer named Joseph Paul Jernigan was executed in 1993 and his cadaver was used to provide male data for the Project. More particularly, Jernigan's cadaver was encased and frozen in a gelatin and water mixture in order to stabilize the cadaver for cutting thereof.
  • Jernigan's cadaver was then segmented (i.e., “cut”) along its axial plane (also known as its transverse plane) at one millimeter intervals from the top of the cadaver's scalp to the soles of its feet, resulting in 1,871 “slices”. Each of these slices was photographed and digitized at a high resolution.
  • the term “convict hull” is accordingly used herein to refer to a given plane that is used to “slice” a given mesh model of a given scene.
  • the term “sensor” is used herein to refer to any one of a variety of scene-sensing devices which can be used to generate a stream of sensor data that represents a given scene.
  • the video generation technique embodiments described herein employ one or more sensors which can be configured in various arrangements to capture a scene, thus allowing one or more streams of sensor data to be generated each of which represents the scene from a different geometric perspective.
  • Each of the sensors can be any type of video capture device (e.g., any type of video camera), or any type of audio capture device (such as a microphone, or the like), or any combination thereof.
  • Each of the sensors can also be either static (i.e., the sensor has a fixed spatial location and a fixed rotational orientation which do not change over time), or moving (i.e., the spatial location and/or rotational orientation of the sensor change over time).
  • the video generation technique embodiments described herein can employ a combination of different types of sensors to capture a given scene.
  • the video generation technique embodiments described herein generally involve using convict hulls to generate a video of a given scene and then present the video to one or more end users.
  • the video generation technique embodiments support the generation, storage, distribution, and end user presentation of any type of video.
  • one embodiment of the video generation technique supports various types of traditional, single viewpoint video in which the viewpoint of the scene is chosen by the director when the video is recorded/captured and this viewpoint cannot be controlled or changed by an end user while they are viewing the video.
  • In a single viewpoint video, the viewpoint of the scene is fixed and cannot be modified when the video is being rendered and displayed to an end user.
  • Another embodiment of the video generation technique supports various types of free viewpoint video in which the viewpoint of the scene can be interactively controlled and changed by an end user at will while they are viewing the video.
  • In a free viewpoint video, an end user can interactively generate synthetic (i.e., virtual) viewpoints of the scene on-the-fly when the video is being rendered and displayed.
  • the video generation technique embodiments described herein are advantageous for various reasons including, but not limited to, the following.
  • the video generation technique embodiments serve to minimize the size of (i.e., minimize the amount of data in) the video that is generated, stored and distributed. Based on this video size/data minimization, it will also be appreciated that the video generation technique embodiments minimize the cost and maximize the performance associated with storing and transmitting the video in a client-server framework where the video is generated and stored on a server computing device, and then transmitted from the server over a data communication network to one or more client computing devices upon which the video is rendered and then viewed and navigated by the one or more end users.
  • the video generation technique embodiments maximize the photo-realism of the video that is generated when it is rendered and then viewed and navigated by the end users.
  • the video generation technique embodiments provide the end users with photo-realistic video that is free of discernible artifacts, thus creating a feeling of immersion for the end users and enhancing their viewing experience.
  • the video generation technique embodiments described herein eliminate having to constrain the complexity or composition of the scene that is being captured (e.g., neither the environment(s) in the scene, nor the types of objects in the scene, nor the number of people in the scene, among other things, has to be constrained). Accordingly, the video generation technique embodiments are operational with any type of scene, including both relatively static and dynamic scenes. The video generation technique embodiments also provide a flexible, robust and commercially viable method for generating a video, and then presenting it to one or more end users, that meets the needs of today's various creative video producers and editors.
  • the video generation technique embodiments are applicable to various types of video-based media applications such as consumer entertainment (e.g., movies, television shows, and the like) and video-conferencing/telepresence, among others.
  • FIG. 1 illustrates an exemplary embodiment, in simplified form, of a video processing pipeline for implementing the video generation technique embodiments described herein.
  • the video generation technique embodiments support the generation, storage, distribution, and end user presentation of any type of video including, but not limited to, various types of single viewpoint video and various types of free viewpoint video.
  • the video processing pipeline 100 starts with a generation stage 102 during which, generally speaking, scene proxies of a given scene are generated.
  • the generation stage 102 includes a capture sub-stage 104 and a processing sub-stage 106 whose operation will now be described in more detail.
  • the capture sub-stage 104 of the video processing pipeline 100 generally captures the scene and generates one or more streams of sensor data that represent the scene. More particularly, in an embodiment of the video generation technique described herein where a single viewpoint video is being generated, stored, distributed and presented to one or more end users (hereafter simply referred to as the single viewpoint embodiment of the video generation technique), during the capture sub-stage 104 a single sensor is used to capture the scene, where the single sensor includes a video capture device and generates a single stream of sensor data which represents the scene from a single geometric perspective. The stream of sensor data is received from the sensor and then output to the processing sub-stage 106 .
  • In the free viewpoint embodiment of the video generation technique, an arrangement of sensors is used to capture the scene, where the arrangement includes a plurality of video capture devices and generates a plurality of streams of sensor data each of which represents the scene from a different geometric perspective. These streams of sensor data are received from the sensors and calibrated, and then output to the processing sub-stage 106.
  • the processing sub-stage 106 of the video processing pipeline 100 receives the stream(s) of sensor data from the capture sub-stage 104 , and then generates scene proxies that geometrically describe the captured scene as a function of time from the stream(s) of sensor data. The scene proxies are then output to a storage and distribution stage 108 .
  • the storage and distribution stage 108 of the video processing pipeline 100 receives the scene proxies from the processing sub-stage 106 , stores the scene proxies, outputs the scene proxies and distributes them to one or more end users who either are, or will be, viewing the video, or both.
  • this distribution takes place by transmitting the scene proxies over whatever one or more data communication networks the end user computing devices are connected to. It will be appreciated that this transmission is implemented in a manner that meets the needs of the specific implementation of the video generation technique embodiments and the related type of video that is being processed in the pipeline 100 .
  • the end user presentation stage 110 of the video processing pipeline 100 receives the scene proxies that are output from the storage and distribution stage 108 , and then presents each of the end users with a rendering of the scene proxies.
  • the end user presentation stage 110 includes a rendering sub-stage 112 and a user viewing experience sub-stage 114 whose operation will now be described in more detail.
  • the rendering sub-stage 112 of the video processing pipeline 100 receives the scene proxies that are output from the storage and distribution stage 108 , and then renders images of the captured scene from the scene proxies, where these images have a fixed viewpoint that cannot be modified by an end user.
  • the fixed viewpoint images of the captured scene are then output to the user viewing experience sub-stage 114 of the pipeline 100 .
  • the user viewing experience sub-stage 114 receives the fixed viewpoint images of the captured scene from the rendering sub-stage 112 , and then displays these images on a display device for viewing by a given end user.
  • the user viewing experience sub-stage 114 can provide the end user with the ability to interactively temporally navigate/control the single viewpoint video at will, and based on this temporal navigation/control the rendering sub-stage 112 will either temporally pause/stop, or rewind, or fast forward the single viewpoint video accordingly.
  • the rendering sub-stage 112 receives the scene proxies that are output from the storage and distribution stage 108 , and then renders images of the captured scene from the scene proxies, where these images have a synthetic viewpoint that can be modified by an end user.
  • the synthetic viewpoint images of the captured scene are then output to the user viewing experience sub-stage 114 .
  • the user viewing experience sub-stage 114 receives the synthetic viewpoint images of the captured scene from the rendering sub-stage 112 , and then displays these images on a display device for viewing by a given end user.
  • the user viewing experience sub-stage 114 can provide the end user with the ability to spatio/temporally navigate/control the synthetic viewpoint images of the captured scene on-the-fly at will.
  • the user viewing experience sub-stage 114 can provide the end user with the ability to continuously and interactively navigate/control their viewpoint of the images of the scene that are being displayed on the display device, and based on this viewpoint navigation the rendering sub-stage 112 will modify the images of the scene accordingly.
  • the user viewing experience sub-stage 114 can also provide the end user with the ability to interactively temporally navigate/control the free viewpoint video at will, and based on this temporal navigation/control the rendering sub-stage 112 will either temporally pause/stop, or rewind, or fast forward the free viewpoint video accordingly.
  • the video generation technique embodiments described herein generally employ one or more sensors which can be configured in various arrangements to capture a scene. These one or more sensors generate one or more streams of sensor data each of which represents the scene from a different geometric perspective.
  • FIG. 2 illustrates an exemplary embodiment, in simplified form, of a process for generating a video of a scene.
  • the process starts in block 200 with receiving the one or more streams of sensor data that represent the scene.
  • Scene proxies are then generated from these streams of sensor data (block 202 ), where the scene proxies geometrically describe the scene as a function of time.
  • the scene proxies can then be stored (block 204 ).
  • the scene proxies can also be distributed to the end user (block 206 ).
  • FIG. 3 illustrates an exemplary embodiment, in simplified form, of a process for generating the scene proxies from the one or more streams of sensor data that represent the scene.
  • the process starts in block 300 with generating a stream of mesh models of the scene from the streams of sensor data, where each of the mesh models includes a collection of vertices and a collection of polygonal faces that are formed by the vertices.
  • the following actions then take place for each of the mesh models (block 302 ).
  • the mesh model is sliced using a series of planes that are parallel to each other, where each of the planes in the series of planes defines one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane (block 304 ).
  • a texture map for the mesh model is then generated, where the texture map defines texture data corresponding to each of the contours that is defined by the series of planes (block 306 ). It will be appreciated that this texture map can be generated using various methods, one example of which is described in more detail hereafter.
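  • A minimal sketch of the slicing action of block 304, assuming horizontal slicing planes of the form z = z0 and triangles stored as vertex-index triples (degenerate cases are ignored, and the chaining of segments into closed contours is only noted, not shown):

```python
import numpy as np

def slice_triangle_with_plane(p0, p1, p2, z0):
    """Return the segment (two 3D points) where triangle (p0, p1, p2) crosses
    the horizontal plane z = z0, or None if it does not cross it. A sketch:
    vertices lying exactly on the plane and coplanar triangles are not handled."""
    pts = [np.asarray(p, dtype=float) for p in (p0, p1, p2)]
    crossings = []
    for a, b in ((0, 1), (1, 2), (2, 0)):
        za, zb = pts[a][2], pts[b][2]
        if (za - z0) * (zb - z0) < 0:            # edge straddles the plane
            t = (z0 - za) / (zb - za)            # interpolation factor along the edge
            crossings.append(pts[a] + t * (pts[b] - pts[a]))
    return tuple(crossings) if len(crossings) == 2 else None

def slice_mesh(vertices, faces, z0):
    """vertices: (V, 3) array; faces: (F, 3) int array of triangle vertex indices.
    Collect all intersection segments of the mesh with the plane z = z0; chaining
    these segments end-to-end (not shown) yields the one or more closed contours
    that the plane defines on the mesh."""
    segments = []
    for f in faces:
        seg = slice_triangle_with_plane(vertices[f[0]], vertices[f[1]], vertices[f[2]], z0)
        if seg is not None:
            segments.append(seg)
    return segments
```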
  • FIGS. 4A-4C illustrate an exemplary embodiment, in simplified form, of a series of three planes that are parallel to each other and are used to slice a mesh model of a human.
  • the series of planes 400 / 402 / 404 has a horizontal spatial orientation.
  • the top-most plane 400 in the series of planes slices the mesh model 406 substantially along the shoulder region of the human and defines a single contour 408 along this region.
  • the middle plane 402 in the series of planes slices the mesh model 406 substantially along the right bicep region, chest region and left bicep region of the human and defines two different contours 410 and 412 . More particularly, the middle plane 402 defines a contour 410 along the right bicep region of the human. The middle plane 402 defines another contour 412 along the chest and left bicep regions of the human. As exemplified in FIG. 4C , the bottom-most plane 404 in the series of planes slices the mesh model 406 substantially along the right elbow region, upper stomach region, right hand region, left forearm region, and left hand region of the human and defines three different contours 414 / 416 / 418 .
  • the bottom-most plane 404 defines a contour 414 along the right elbow region of the human.
  • the bottom-most plane 404 defines another contour 416 along the upper stomach, left hand and left forearm regions of the human.
  • the bottom-most plane 404 defines yet another contour 418 along the right hand region of the human.
  • FIG. 5 illustrates an exemplary embodiment, in simplified form, of a process for generating a texture map for a given mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes.
  • the texture map includes a series of scanlines each of which corresponds to a different one of the planes in the series of planes, where each of the scanlines includes a series of texels, and the total number of texels in each of the scanlines is the same. It will be appreciated that any number of texels per scanline can be used. In an exemplary embodiment of the video generation technique described herein the total number of texels in each of the scanlines is 1024.
  • As exemplified in FIG. 5, the process starts in block 500 with the following actions taking place for each of the planes in the series of planes.
  • Each of the contours that is defined by the plane is analyzed in a prescribed order across the plane to identify a series of point locations along the contour (block 502 ).
  • This analysis of each of the contours that is defined by the plane is performed starting from a prescribed zero position on the contour, where this zero position is the same for each of the contours that is defined by the plane.
  • successive point locations along the contour are separated by a prescribed distance which is measured along the contour, and the just-described order, distance and zero position are the same for each of the planes in the series of planes.
  • each of the point locations in the contour point map is represented by its two-dimensional coordinates on the particular plane on which it lies.
  • each of the point locations in the contour point map is represented by its angle from the point location that immediately precedes it on its contour.
  • the contour point map may be used for additional types of processing.
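  • A sketch of the point-location construction for a single contour, assuming the contour is available as an ordered list of 2D points on its plane whose first entry is the prescribed zero position (the prescribed order, spacing and zero position are implementation choices):

```python
import numpy as np

def resample_contour(points, spacing):
    """Walk a closed contour (an (N, 2) array of 2D points on the slicing plane,
    with points[0] taken as the prescribed zero position) and return point
    locations separated by `spacing`, measured along the contour."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])                  # close the loop
    seg = np.diff(closed, axis=0)                       # per-segment vectors
    seg_len = np.linalg.norm(seg, axis=1)               # per-segment lengths
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])   # arc length at each vertex
    targets = np.arange(0.0, cum[-1], spacing)          # arc lengths to sample
    out = []
    for s in targets:
        i = np.searchsorted(cum, s, side='right') - 1   # segment containing s
        t = (s - cum[i]) / seg_len[i]
        out.append(closed[i] + t * seg[i])
    return np.array(out)
```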
  • the one or more streams of sensor data that represent the scene are used to compute texture data for each of the texels that is in the texture map (block 512 ), and this computed texture data is entered into the texture map (block 514 ).
  • this texture data computation is performed in the following manner. For each of the texels that is in the texture map, a projective texture mapping method is used to sample each of the streams of sensor data that represent the scene and combine texture information from each of these samples to generate texture data for the texel.
  • the video generation technique embodiments described herein can compute various textures in order to maximize the photo-realism of the objects that are represented by the mesh models.
  • the texture data that is computed for each of the texels that is in the texture map can include one or more of color data, or specular highlight data, or transparency data, or reflection data, or shadowing data, among other things.
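  • The projection step of this texture-data computation might look roughly like the following sketch, which assumes a standard pinhole calibration (K, R, t) per sensor and reads back a single nearest-pixel color per view; the patent does not prescribe this particular model, and a full implementation would also handle occlusion and blend the samples from all of the sensor streams:

```python
import numpy as np

def sample_color(point_3d, K, R, t, image):
    """Project a 3D point location into one calibrated sensor view and return a
    color sample, or None if the point is behind the camera or outside the frame.
    K: 3x3 intrinsics; [R | t]: world-to-camera transform; image: (H, W, 3) array."""
    p_cam = R @ np.asarray(point_3d, dtype=float) + t   # world -> camera coordinates
    if p_cam[2] <= 0:
        return None                                     # behind the camera
    uvw = K @ p_cam
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]             # perspective divide
    h, w = image.shape[:2]
    if 0 <= u < w and 0 <= v < h:
        return image[int(v), int(u)]                    # nearest-pixel sample
    return None                                         # projects outside the view
```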
  • each of the scanlines in the texture map can optionally be replicated a prescribed number of times, where this number is greater than or equal to one (block 516 ).
  • In an exemplary embodiment of the video generation technique described herein this prescribed number of times is one, so that each of the scanlines in the texture map is replicated once and two neighboring scanlines in the texture map correspond to each of the planes in the series of planes. It is noted that a trade-off exists in the selection of the number of times each of the scanlines in the texture map is replicated.
  • replicating each of the scanlines a larger number of times creates more texture resolution in the images of the scene that are rendered from the scene proxies and displayed to the end users, but can also increase the amount of storage and network communication resources that are used in the aforementioned storage and distribution stage of the video processing pipeline exemplified in FIG. 1 , and can also increase the amount of processing that is associated with the aforementioned rendering sub-stage of the video processing pipeline.
  • replicating each of the scanlines a smaller number of times creates less texture resolution in the images of the scene that are rendered, but can also decrease the amount of storage and network communication resources that are used in the storage and distribution stage, and can also decrease the amount of processing that is associated with the rendering sub-stage.
  • the contours that are defined by a given plane can be analyzed in any order across the plane.
  • the contours that are defined by each of the planes are analyzed in a left-to-right order across the plane.
  • the just-described zero position on the contour can be any position thereon as long as it is the same for each of the contours being analyzed. In an exemplary embodiment of the video generation technique this zero position is the left-most point on the contour. It is also noted that a trade-off exists in the selection of the distance which separates successive point locations along the contours that are defined by the series of planes.
  • using a smaller distance between successive point locations along the contours creates a finer approximation of each of the mesh models, but also increases the amount of processing that is associated with the aforementioned processing sub-stage of the video processing pipeline exemplified in FIG. 1 .
  • using a larger distance between successive point locations along the contours creates a coarser approximation of each of the mesh models, but also decreases the amount of processing that is associated with the processing sub-stage.
  • FIG. 6 illustrates an exemplary embodiment, in simplified form, of a contour and point locations that are identified along the contour.
  • a series of point locations 602 - 620 are identified along the contour 600 .
  • Successive point locations along the contour 600 (e.g., point location 614 and point location 616) are separated by the distance D, which is measured along the contour.
  • FIG. 7 illustrates an exemplary embodiment, in simplified form, of a contour point map.
  • the contour point map 700 exemplified in FIG. 7 corresponds to the series of three planes 400 / 402 / 404 that are used to slice the mesh model of the human 406 in FIGS. 4A-4C .
  • the contour point map 700 includes three lines of data 702 / 704 / 706 each of which corresponds to a different one of the three planes 400 / 402 / 404 . More particularly, a first line of data 702 in the contour point map 700 corresponds to the top-most plane 400 .
  • the first line of data 702 specifies that the top-most plane 400 defines one contour (namely, contour 408) and lists each of the A total point locations that are identified along this contour (namely, P1,1, P1,2, …, P1,A).
  • the second line of data 704 specifies that the middle plane 402 defines two contours (namely, contour 410 and contour 412 ).
  • the second line of data 704 also lists each of the B total point locations that are identified along contour 410 (namely, P1,1, P1,2, …, P1,B), and then lists each of the C total point locations that are identified along contour 412 (namely, P2,1, P2,2, …, P2,C).
  • the third line of data 706 specifies that the bottom-most plane 404 defines three contours (namely, contour 414 , contour 416 , and contour 418 ).
  • the third line of data 706 also lists each of the D total point locations that are identified along contour 414 (namely, P1,1, P1,2, …, P1,D), and then lists each of the E total point locations that are identified along contour 416 (namely, P2,1, P2,2, …, P2,E), and then lists each of the F total point locations that are identified along contour 418 (namely, P3,1, P3,2, …, P3,F).
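  • Mirroring FIG. 7, the contour point map can be pictured as a nested list with one entry per plane; the coordinates below are placeholders invented purely for illustration:

```python
# One entry per plane; each entry lists that plane's contours, and each contour
# lists its point locations (2D coordinates on the plane). Coordinates here are
# arbitrary placeholders.
contour_point_map = [
    [  # plane 400: one contour (contour 408)
        [(0.10, 0.20), (0.30, 0.22), (0.50, 0.25)],
    ],
    [  # plane 402: two contours (contours 410 and 412)
        [(0.05, 0.10), (0.15, 0.12)],
        [(0.40, 0.10), (0.55, 0.12), (0.70, 0.10)],
    ],
    [  # plane 404: three contours (contours 414, 416 and 418)
        [(0.00, 0.05), (0.05, 0.08)],
        [(0.30, 0.02), (0.45, 0.05), (0.60, 0.02)],
        [(0.85, 0.03), (0.90, 0.06)],
    ],
]

# e.g., the number of contours defined by the middle plane:
assert len(contour_point_map[1]) == 2
```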
  • FIG. 12 illustrates an exemplary embodiment, in simplified form, of a process for assigning the texels in the scanline of the texture map that corresponds to the plane to the contours that are defined by the plane.
  • the process embodiment exemplified in FIG. 12 generally determines what percentage of the texels in the scanline are to be assigned to each of the contours that is defined by the plane.
  • the process starts in block 1200 with the following actions taking place for each of the contours that are defined by the plane.
  • the length of the contour is calculated (block 1202 ).
  • the normalized length of the contour is then calculated by dividing the length of the contour by the sum of the lengths of all of the contours that are defined by the plane (block 1204 ).
  • the number of texels in the scanline that are to be assigned to the contour is then calculated by multiplying the normalized length of the contour by the total number of texels in the scanline (block 1206), and this calculated number of texels is assigned to the contour (block 1208).
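  • The calculation of blocks 1202-1208 reduces to a few lines; the sketch below uses simple rounding, so the resulting counts may need a minor adjustment to sum exactly to the scanline's texel total:

```python
def assign_texels(contour_lengths, texels_per_scanline=1024):
    """Split one scanline's texels among the contours defined by its plane,
    in proportion to each contour's length."""
    total_length = sum(contour_lengths)
    return [round(texels_per_scanline * (length / total_length))
            for length in contour_lengths]

# Example: a plane defining two contours of lengths 2.0 and 1.0 gets roughly
# a 2:1 split of the scanline's texels.
print(assign_texels([2.0, 1.0], texels_per_scanline=6))   # -> [4, 2]
```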
  • FIG. 14 illustrates a simplified example of the assignment of the texels in two neighboring scanlines of an exemplary texture map to the exemplary contours that are defined by two neighboring planes that correspond to the two neighboring scanlines.
  • a first scanline 1400 in the texture map (not shown) has a series of six texels T1,1-T1,6, and a second scanline 1402 which immediately succeeds the first scanline in the texture map has another series of six texels T2,1-T2,6.
  • There are two contours 1404 and 1406 that are defined by a first plane (not shown) in a series of planes (not shown), and there are another two contours 1408 and 1410 that are defined by a second plane (not shown) which immediately succeeds the first plane in the series of planes.
  • the first four texels T1,1-T1,4 in the first scanline 1400 will be assigned to contour 1404,
  • the last two texels T1,5 and T1,6 in the first scanline will be assigned to contour 1406.
  • the first two texels T2,1 and T2,2 in the second scanline 1402 will be assigned to contour 1408,
  • the last four texels T2,3-T2,6 in the second scanline will be assigned to contour 1410.
  • the series of planes that is used to slice each of the mesh models has a prescribed spatial orientation and a prescribed geometry.
  • the geometry of the series of planes can be specified using various types of data. Examples of such types of data include, but are not limited to, data specifying the number of planes in the series of planes, data specifying a prescribed spacing that is used between successive planes in the series of planes, and data specifying the shape and dimensions of each of the planes in the series of planes.
  • Both the spatial orientation and the geometry of the series of planes are arbitrary and as such, various spatial orientations and geometries can be used for the series of planes.
  • the series of planes has a horizontal spatial orientation.
  • the series of planes has a vertical spatial orientation. It will be appreciated that any number of planes can be used, any spacing between successive planes can be used, any plane shape/dimensions can be used, and any distance which separates successive point locations along the contours can be used. In an exemplary embodiment of the video generation technique described herein the spacing that is used between successive planes in the series of planes is selected such that the series of planes intersects a maximum number of vertices in each of the mesh models. This is advantageous in a situation where each of the mesh models of the scene includes a mesh texture map that defines texture data which has already been computed for the vertices of the mesh model.
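  • A sketch of one simple way to specify the geometry of such a series of planes, assuming a horizontal orientation and uniform spacing (the patent equally allows a vertical orientation, other plane shapes and dimensions, or a spacing chosen so the planes pass through a maximum number of mesh vertices):

```python
import numpy as np

def build_plane_series(z_min, z_max, num_planes):
    """Return the z-offsets of a series of parallel, horizontally oriented
    slicing planes with uniform spacing over the capture volume."""
    return np.linspace(z_min, z_max, num_planes)

# e.g., 256 horizontal planes spanning a capture volume 2 m tall:
plane_offsets = build_plane_series(0.0, 2.0, 256)
spacing = plane_offsets[1] - plane_offsets[0]
```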
  • the spatial orientation and geometry of the series of planes, the order across each of the planes in the series of planes by which each of the contours that is defined by the plane is analyzed (hereafter simply referred to as the contour analysis order), the number of texels in each of the scanlines, the zero position on each of the contours (hereafter simply referred to as the contour zero position), and the number of times each of the scanlines in the texture map is replicated are pre-determined and thus are known to each of the end user computing devices in advance of the scene proxies being distributed thereto.
  • one or more of the spatial orientation of the series of planes, or the geometry of the series of planes, or the contour analysis order, or the number of texels in each of the scanlines, or the contour zero position, or the number of times each of the scanlines in the texture map is replicated may not be pre-determined and thus may not be known to each of the end user computing devices in advance of the scene proxies being distributed thereto.
  • FIG. 8 illustrates an exemplary embodiment, in simplified form, of a process for storing the scene proxies.
  • the following actions take place for each of the mesh models (block 800 ).
  • the mathematical equation describing each of the contours that is defined by the series of planes is stored (block 802 ).
  • Data specifying which contours on neighboring planes in the series of planes correspond to each other (e.g., which neighboring contours are associated with either the same object or the same surface of a given object in the scene) is also stored (block 804).
  • the texture map for the mesh model is also stored (block 806 ). Whenever the spatial orientation of the series of planes is not pre-determined, data specifying this spatial orientation will also be stored (block 808 ).
  • the mathematical equation describing a given contour specifies a polygon approximation of the contour.
  • the mathematical equation describing a given contour specifies a non-uniform rational basis spline (NURBS) curve approximation of the contour.
  • FIG. 9 illustrates an exemplary embodiment, in simplified form, of a process for distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network.
  • the following actions take place for each of the mesh models (block 900 ).
  • the mathematical equation describing each of the contours that is defined by the series of planes is transmitted over the network to the other computing device (block 902 ).
  • Data specifying which contours on neighboring planes in the series of planes correspond to each other is also transmitted over the network to the other computing device (block 904 ).
  • the texture map for the mesh model is also transmitted over the network to the other computing device (block 906 ).
  • Whenever the contour zero position is not pre-determined, data specifying the contour zero position will also be transmitted over the network to the other computing device (block 916).
  • Similarly, whenever the number of times each of the scanlines in the texture map is replicated is not pre-determined, data specifying this number of times will also be transmitted over the network to the other computing device (block 918).
  • This section provides a more detailed description of the end user presentation stage of the video processing pipeline.
  • FIG. 10 illustrates an exemplary embodiment, in simplified form, of a process for presenting a video of a scene to an end user.
  • the process starts in block 1000 with receiving scene proxies that geometrically describe the scene as a function of time.
  • the scene proxies include a stream of mathematical equations which describe contours that are defined by a series of planes that are parallel to each other.
  • the scene proxies also include a stream of data specifying which contours on neighboring planes in the series of planes correspond to each other.
  • the scene proxies also include a stream of texture maps which define texture data corresponding to each of the contours that is defined by the series of planes.
  • images of the scene are rendered from the scene proxies (block 1002 ).
  • the images of the scene are then displayed on a display device (block 1004 ) so that they can be viewed and navigated by the end user.
  • the video that is being presented to the end user can be any type of video including, but not limited to, asynchronous single viewpoint video, or asynchronous free viewpoint video, or unidirectional live single viewpoint video, or unidirectional live free viewpoint video, or bidirectional live single viewpoint video, or bidirectional live free viewpoint video.
  • FIG. 11 illustrates an exemplary embodiment, in simplified form, of a process for rendering images of the scene from the scene proxies.
  • the process starts in block 1100 with constructing the series of planes using data specifying the spatial orientation of the series of planes and the geometry of the series of planes.
  • the contours that are defined by the series of planes are then constructed using the stream of mathematical equations (block 1102 ).
  • a series of point locations is then constructed along each of the contours that is defined by the series of planes (block 1104 ), where this construction is performed in the aforementioned prescribed order across each of the planes in the series of planes, and this construction is also performed starting from the aforementioned prescribed zero position on each of the contours that is defined by each of the planes in the series of planes.
  • a common number of point locations is constructed along each of the contours, and these point locations can be interspersed along the contour such that they are equidistant from each other.
  • blocks 1100 , 1102 and 1104 generate a stream of 3D point models that generally correspond to the stream of mesh models of the scene that was originally sliced using the series of planes. It will also be appreciated that this common number of point locations can have any value; a trade-off associated with the selection of this value is described in more detail hereafter.
  • each of the mathematical equations specifies how the particular contour it describes is positioned on the particular plane that defines this contour (e.g., by specifying one or more control points for the contour, among other ways).
  • An alternate embodiment of the video generation technique is also possible where each of the mathematical equations does not specify how the particular contour it describes is positioned on the particular plane that defines this contour, in which case this positioning information will be separately specified, stored and distributed to the end user.
  • The point locations that are defined by the series of planes are then tessellated, where this tessellation generates a stream of polygonal models, and each polygonal model includes a collection of polygonal faces that are formed by neighboring point locations on corresponding contours on neighboring planes in the series of planes (block 1106).
  • the polygonal faces are either triangles, or quadrilaterals, or any other prescribed type of polygon. It will be appreciated that the action of block 1106 serves to recreate the stream of mesh models of the scene.
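  • A sketch of the tessellation of block 1106 for one pair of corresponding contours on neighboring planes, assuming (as described above) that the same number of point locations was constructed along each contour so that equal point indices correspond:

```python
import numpy as np

def loft_between_contours(lower_pts, upper_pts):
    """Tessellate two corresponding closed contours on neighboring planes into
    triangular faces. lower_pts / upper_pts: (N, 3) arrays of point locations;
    returns a (2N, 3) int array of vertex indices into
    np.vstack([lower_pts, upper_pts])."""
    n = len(lower_pts)
    faces = []
    for i in range(n):
        j = (i + 1) % n                       # wrap around the closed contour
        a, b = i, j                           # indices into the lower ring
        c, d = n + i, n + j                   # indices into the upper ring
        faces.append((a, b, c))               # two triangles per quad
        faces.append((b, d, c))
    return np.array(faces, dtype=int)

# Usage: two circular contours 0.1 apart along z, eight points each.
ring_lower = np.array([[np.cos(t), np.sin(t), 0.0]
                       for t in np.linspace(0, 2 * np.pi, 8, endpoint=False)])
ring_upper = ring_lower + np.array([0.0, 0.0, 0.1])
vertices = np.vstack([ring_lower, ring_upper])
faces = loft_between_contours(ring_lower, ring_upper)   # 16 triangles
```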
  • the stream of texture maps is sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models (block 1108 ). Conventional methods are then employed to use this identified texture data to add texture to each of the polygonal faces in the stream of polygonal models (block 1110 ).
  • FIG. 15 illustrates an exemplary embodiment, in simplified form, of a process for sampling the stream of texture maps to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models.
  • this particular embodiment serves to minimize any sampling errors that may occur during the sampling of the stream of texture maps.
  • the process starts in block 1500 with an action of, for each of the scanlines in each of the texture maps, adapting the number of texels in the scanline that are assigned to each one of the contours that is defined by the plane corresponding to the scanline to be the average of the number of texels in the scanline that are assigned to this one of the contours and the number of texels in the next scanline in the series of scanlines that are assigned to another contour that corresponds to this one of the contours, where this adaption results in a modified version of each of the texture maps.
  • the modified version of each of the texture maps is then sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models (block 1502 ).
  • the just-described adaption operates in the following manner.
  • Whenever this averaging reduces the number of texels that are assigned to a particular contour, the adaption involves using conventional methods to contract the corresponding series of texels in the scanline.
  • Whenever this averaging increases the number of texels that are assigned to a particular contour, the adaption involves using conventional methods to expand the corresponding series of texels in the scanline.
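  • Numerically, the adaption amounts to averaging the per-contour texel counts of neighboring scanlines; a sketch under the assumption that the i-th contour on each of the two planes correspond (the actual contraction or expansion of the texel series is not shown):

```python
def adapt_texel_counts(counts_this_scanline, counts_next_scanline):
    """For each contour on this plane, replace the number of texels assigned to
    it with the average of its own count and the count assigned to the
    corresponding contour on the next plane."""
    return [round((a + b) / 2)
            for a, b in zip(counts_this_scanline, counts_next_scanline)]

# With the counts from FIG. 14 ([4, 2] and [2, 4]), and assuming contour 1404
# corresponds to 1408 and contour 1406 to 1410, both contours on the first
# plane end up with 3 texels each.
print(adapt_texel_counts([4, 2], [2, 4]))   # -> [3, 3]
```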
  • any sampling errors that may occur during the sampling of the stream of texture maps can further be minimized by optionally performing an action of, for each of the scanlines in each of the texture maps, inserting one or more “gutter texels” between neighboring texel series in the scanline, and optionally also inserting one or more “gutter scanlines” between neighboring scanlines in each of the texture maps. It will be appreciated that implementing such gutter texels and gutter scanlines serves to prevent bleed-over when the stream of texture maps is sampled using certain sampling methods such as the conventional bilinear interpolation method.
  • This section provides a more detailed description of exemplary types of single viewpoint video and exemplary types of free viewpoint video that are supported by the video generation technique embodiments described herein.
  • One implementation of the single viewpoint embodiment of the video generation technique described heretofore supports asynchronous (i.e., non-live) single viewpoint video, and a similar implementation of the free viewpoint embodiment of the video generation technique described heretofore supports asynchronous free viewpoint video.
  • Both of these implementations correspond to a situation where the streams of sensor data that are generated by the sensors are pre-captured 104 , then post-processed 106 , and the resulting scene proxies are then stored and can be transmitted in a one-to-many manner (i.e., broadcast) to one or more end users 108 .
  • This pre-capture and post-processing allows a video producer to optionally manually “touch-up” the streams of sensor data that are received during the capture sub-stage 104, and also to optionally manually remove any three-dimensional reconstruction artifacts that are introduced in the processing sub-stage 106.
  • These particular implementations are referred to hereafter as the asynchronous single viewpoint video implementation and the asynchronous free viewpoint video implementation respectively.
  • Exemplary types of video-based media that work well in the asynchronous single viewpoint video and asynchronous free viewpoint video implementations include movies, documentaries, sitcoms and other types of television shows, music videos, digital memories, and the like.
  • Another exemplary type of video-based media that works well in the asynchronous single viewpoint video and asynchronous free viewpoint video implementations is the use of special effects technology where synthetic objects are realistically modeled, lit, shaded and added to a pre-captured scene.
  • Another implementation of the single viewpoint embodiment of the video generation technique supports unidirectional (i.e., one-way) live single viewpoint video, and a similar implementation of the free viewpoint embodiment of the video generation technique supports unidirectional live free viewpoint video.
  • Both of these implementations correspond to a situation where the streams of sensor data that are being generated by the sensors are concurrently captured 104 and processed 106 , and the resulting scene proxies are stored and transmitted in a one-to-many manner on-the-fly (i.e., live) to one or more end users 108 .
  • each end user can view 114 the scene live (i.e., each user can view the scene at substantially the same time it is being captured 104).
  • These particular implementations are referred to hereafter as the unidirectional live single viewpoint video implementation and the unidirectional live free viewpoint video implementation respectively.
  • exemplary types of video-based media that work well in the unidirectional live single viewpoint video and unidirectional live free viewpoint video implementations include sporting events, news programs, live concerts, and the like.
  • yet another implementation of the single viewpoint embodiment of the video generation technique supports bidirectional (i.e., two-way) live single viewpoint video (such as that which is associated with various video-conferencing/telepresence applications), and a similar implementation of the free viewpoint embodiment of the video generation technique supports bidirectional live free viewpoint video.
  • These particular implementations are referred to hereafter as the bidirectional live single viewpoint video implementation and the bidirectional live free viewpoint video implementation respectively.
  • the bidirectional live single/free viewpoint video implementation is generally the same as the unidirectional live single/free viewpoint video implementation with the following exception.
  • a computing device at each physical location that is participating in a given video-conferencing/telepresence session is able to concurrently capture 104 streams of sensor data that are being generated by sensors which are capturing a local scene and process 106 these locally captured streams of sensor data, store and transmit the resulting local scene proxies in a one-to-many manner on the fly to the other physical locations that are participating in the session 108 , receive remote scene proxies from each of the remote physical locations that are participating in the session 108 , and render 112 each of the received proxies.
  • In these bidirectional live implementations, the generation, storage and distribution, and end user presentation stages 102/108/110 have to be completed within a very short period of time.
  • the video generation technique embodiments described herein make this possible based on the aforementioned video size/data minimization that is achieved by the video generation technique embodiments.
  • While the video generation technique has been described by specific reference to embodiments thereof, it is understood that variations and modifications thereof can be made without departing from the true spirit and scope of the video generation technique.
  • alternate embodiments of the video generation technique described herein are possible which support any other digital image application where a scene is represented by a mesh model and a corresponding mesh texture map which defines texture data for the mesh model.
  • FIG. 13 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the video generation technique, as described herein, may be implemented. It is noted that any boxes that are represented by broken or dashed lines in FIG. 13 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • FIG. 13 shows a general system diagram showing a simplified computing device 1300 .
  • Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers (PCs), server computers, handheld computing devices, laptop or mobile computers, communications devices such as cell phones and personal digital assistants (PDAs), multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and audio or video media players.
  • The device should have sufficient computational capability and system memory to enable basic computational operations.
  • The computational capability is generally illustrated by one or more processing unit(s) 1310, and may also include one or more graphics processing units (GPUs) 1315, either or both in communication with system memory 1320.
  • The processing unit(s) 1310 may be specialized microprocessors (such as a digital signal processor (DSP), a very long instruction word (VLIW) processor, a field-programmable gate array (FPGA), or other micro-controller), or can be conventional central processing units (CPUs) having one or more processing cores, including, but not limited to, specialized GPU-based cores in a multi-core CPU.
  • The simplified computing device 1300 of FIG. 13 may also include other components, such as, for example, a communications interface 1330.
  • The simplified computing device 1300 of FIG. 13 may also include one or more conventional computer input devices 1340 (e.g., pointing devices, keyboards, audio (e.g., voice) input/capture devices, video input/capture devices, haptic input devices, devices for receiving wired or wireless data transmissions, and the like).
  • The simplified computing device 1300 of FIG. 13 may also include other optional components, such as, for example, one or more conventional computer output devices 1350 (e.g., display device(s) 1355, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, and the like).
  • Exemplary types of input devices (herein also referred to as user interface modalities) and display devices that are operable with the video generation technique embodiments described herein have been described heretofore.
  • Typical communications interfaces 1330, additional types of input and output devices 1340 and 1350, and storage devices 1360 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • The simplified computing device 1300 of FIG. 13 may also include a variety of computer readable media.
  • Computer readable media can be any available media that can be accessed by the computer 1300 via storage devices 1360 , and includes both volatile and nonvolatile media that is either removable 1370 and/or non-removable 1380 , for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data.
  • Computer readable media may include computer storage media and communication media.
  • Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as digital versatile disks (DVDs), compact discs (CDs), floppy disks, tape drives, hard drives, optical drives, solid state memory devices, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • The terms “modulated data signal” and “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • Communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, radio frequency (RF), infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves.
  • Software, programs, and/or computer program products embodying some or all of the various embodiments of the video generation technique described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • The video generation technique embodiments described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device.
  • Program modules include routines, programs, objects, components, data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • The video generation technique embodiments may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks.
  • In such distributed computing environments, program modules may be located in both local and remote computer storage media including media storage devices.
  • The aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.

Abstract

Video of a scene is generated and presented to a user. A stream of mesh models of the scene is generated from one or more streams of sensor data that represent the scene. Each of the mesh models is sliced using a series of planes that are parallel to each other, where each of the planes in the series defines one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane. A texture map is generated for each of the mesh models which defines texture data corresponding to each of the contours that is defined by the series of planes. Images of the scene are rendered from scene proxies that include a stream of mathematical equations describing the contours, and a stream of the texture maps. The images are displayed.

Description

    CROSS REFERENCE TO RELATED APPLICATION
  • This application claims the benefit of and priority to provisional U.S. patent application Ser. No. 61/653,983 filed May 31, 2012.
  • BACKGROUND
  • A given video generally includes one or more scenes, where each scene in the video can be either relatively static (e.g., the objects in the scene do not substantially change or move over time) or dynamic (e.g., the objects in the scene substantially change and/or move over time). As is appreciated in the art of computer graphics, polygonal modeling is commonly used to represent three-dimensional objects in a scene by approximating the surface of each object using polygons. A polygonal model of a given scene includes a collection of vertices. Two neighboring vertices that are connected by a straight line form an edge in the polygonal model. Three neighboring and non-co-linear vertices that are interconnected by three edges form a triangle in the polygonal model. Four neighboring and non-co-linear vertices that are interconnected by four edges form a quadrilateral in the polygonal model. Triangles and quadrilaterals are the most common types of polygons used in polygonal modeling, although other types of polygons may also be used depending on the capabilities of the renderer that is being used to render the polygonal model. A group of polygons that are interconnected by shared vertices is referred to as a mesh and, as such, a polygonal model of a scene is also known as a mesh model. Each of the polygons that makes up a mesh is referred to as a face in the polygonal/mesh model. Accordingly, a polygonal/mesh model of a scene includes a collection of vertices, edges and polygonal (i.e., polygon-based) faces that represents/approximates the shape of each object in the scene.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts, in a simplified form, that are further described hereafter in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
  • Video generation technique embodiments described herein are generally applicable to generating a video of a scene and presenting it to a user. In an exemplary embodiment of this generation, one or more streams of sensor data that represent the scene are received. Scene proxies are then generated from the streams of sensor data. This scene proxies generation includes the following actions. A stream of mesh models of the scene is generated from the streams of sensor data. Then, for each of the mesh models, the following actions take place. The mesh model is sliced using a series of planes that are parallel to each other, where each of the planes in the series defines one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane. A texture map is then generated for the mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes.
  • In an exemplary embodiment of the just-mentioned presentation, the scene proxies are received. The scene proxies include a stream of mathematical equations describing contours that are defined by a series of planes that are parallel to each other, and a stream of texture maps defining texture data corresponding to each of the contours that is defined by the series of planes. Images of the scene are then rendered from the scene proxies and displayed. This image rendering includes the following actions. The series of planes is constructed using data specifying the spatial orientation and geometry of the series of planes. The contours that are defined by the series of planes are then constructed using the stream of mathematical equations. A series of point locations is then constructed along each of the contours that is defined by the series of planes, where this construction is performed in a prescribed order across each of the planes in the series of planes, and this construction is also performed starting from a prescribed zero position on each of these contours. The point locations that are defined by the series of planes are then tessellated, where this tessellation generates a stream of polygonal models, and each polygonal model includes a collection of polygonal faces that are formed by neighboring point locations on corresponding contours on neighboring planes in the series of planes. The stream of texture maps is then sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models. This identified texture data is then used to add texture to each of the polygonal faces in the stream of polygonal models.
  • DESCRIPTION OF THE DRAWINGS
  • The specific features, aspects, and advantages of the video generation technique embodiments described herein will become better understood with regard to the following description, appended claims, and accompanying drawings where:
  • FIG. 1 is a diagram illustrating an exemplary embodiment, in simplified form, of a video processing pipeline for implementing the video generation technique embodiments described herein.
  • FIG. 2 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating a video of a scene.
  • FIG. 3 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating scene proxies that geometrically describe the scene as a function of time.
  • FIGS. 4A-4C are diagrams illustrating an exemplary embodiment, in simplified form, of a series of three planes that are parallel to each other and are used to slice a mesh model of a human.
  • FIG. 5 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for generating a texture map for a given mesh model in a stream of mesh models of the scene.
  • FIG. 6 is a diagram illustrating an exemplary embodiment, in simplified form, of a contour and point locations that are identified along the contour.
  • FIG. 7 is a diagram illustrating an exemplary embodiment, in simplified form, of a contour point map.
  • FIG. 8 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for storing the scene proxies.
  • FIG. 9 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for distributing the scene proxies to an end user who either is, or will be, viewing the video.
  • FIG. 10 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for presenting the video of the scene to an end user.
  • FIG. 11 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for rendering images of the scene from the scene proxies.
  • FIG. 12 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for assigning texels in a given scanline of the texture map to contours that are defined by a given plane in a series of planes that are used to slice the mesh model.
  • FIG. 13 is a diagram illustrating a simplified example of a general-purpose computer system on which various embodiments and elements of the video generation technique, as described herein, may be implemented.
  • FIG. 14 is a diagram illustrating a simplified example of the assignment of the texels in two neighboring scanlines of an exemplary texture map to exemplary contours that are defined by two neighboring planes that correspond to the two neighboring scanlines.
  • FIG. 15 is a flow diagram illustrating an exemplary embodiment, in simplified form, of a process for sampling a stream of texture maps to identify texture data that corresponds to polygonal faces in a stream of polygonal models that is generated from a stream of mathematical equations describing contours that are defined by the series of planes.
  • DETAILED DESCRIPTION
  • In the following description of video generation technique embodiments reference is made to the accompanying drawings which form a part hereof, and in which are shown, by way of illustration, specific embodiments in which the video generation technique can be practiced. It is understood that other embodiments can be utilized and structural changes can be made without departing from the scope of the video generation technique embodiments.
  • It is also noted that for the sake of clarity specific terminology will be resorted to in describing the video generation technique embodiments described herein and it is not intended for these embodiments to be limited to the specific terms so chosen. Furthermore, it is to be understood that each specific term includes all its technical equivalents that operate in a broadly similar manner to achieve a similar purpose. Reference herein to “one embodiment”, or “another embodiment”, or an “exemplary embodiment”, or an “alternate embodiment”, or “one implementation”, or “another implementation”, or an “exemplary implementation”, or an “alternate implementation” means that a particular feature, a particular structure, or particular characteristics described in connection with the embodiment or implementation can be included in at least one embodiment of the video generation technique. The appearances of the phrases “in one embodiment”, “in another embodiment”, “in an exemplary embodiment”, “in an alternate embodiment”, “in one implementation”, “in another implementation”, “in an exemplary implementation”, and “in an alternate implementation” in various places in the specification are not necessarily all referring to the same embodiment or implementation, nor are separate or alternative embodiments/implementations mutually exclusive of other embodiments/implementations. Yet furthermore, the order of process flow representing one or more embodiments or implementations of the video generation technique does not inherently indicate any particular order nor imply any limitations of the video generation technique.
  • As is known in the arts of human anatomy and medical research, the Visible Human Project was conceived in the late 1980s and run by the U.S. National Library of Medicine. The goal of the Project was to create a detailed human anatomy data set using cross-sectional photographs of the human body in order to facilitate anatomy visualization applications. A convicted murderer named Joseph Paul Jernigan was executed in 1993 and his cadaver was used to provide male data for the Project. More particularly, Jernigan's cadaver was encased and frozen in a gelatin and water mixture in order to stabilize the cadaver for cutting thereof. Jernigan's cadaver was then segmented (i.e., “cut”) along its axial plane (also known as its transverse plane) at one millimeter intervals from the top of the cadaver's scalp to the soles of its feet, resulting in 1,871 “slices”. Each of these slices was photographed and digitized at a high resolution. The term “convict hull” is accordingly used herein to refer to a given plane that is used to “slice” a given mesh model of a given scene.
  • The term “sensor” is used herein to refer to any one of a variety of scene-sensing devices which can be used to generate a stream of sensor data that represents a given scene. Generally speaking and as will be described in more detail hereafter, the video generation technique embodiments described herein employ one or more sensors which can be configured in various arrangements to capture a scene, thus allowing one or more streams of sensor data to be generated each of which represents the scene from a different geometric perspective. Each of the sensors can be any type of video capture device (e.g., any type of video camera), or any type of audio capture device (such as a microphone, or the like), or any combination thereof. Each of the sensors can also be either static (i.e., the sensor has a fixed spatial location and a fixed rotational orientation which do not change over time), or moving (i.e., the spatial location and/or rotational orientation of the sensor change over time). The video generation technique embodiments described herein can employ a combination of different types of sensors to capture a given scene.
  • 1.0 Video Generation Using Convict Hulls
  • The video generation technique embodiments described herein generally involve using convict hulls to generate a video of a given scene and then present the video to one or more end users. The video generation technique embodiments support the generation, storage, distribution, and end user presentation of any type of video. By way of example but not limitation, one embodiment of the video generation technique supports various types of traditional, single viewpoint video in which the viewpoint of the scene is chosen by the director when the video is recorded/captured and this viewpoint cannot be controlled or changed by an end user while they are viewing the video. In other words, in a single viewpoint video the viewpoint of the scene is fixed and cannot be modified when the video is being rendered and displayed to an end user. Another embodiment of the video generation technique supports various types of free viewpoint video in which the viewpoint of the scene can be interactively controlled and changed by an end user at will while they are viewing the video. In other words, in a free viewpoint video an end user can interactively generate synthetic (i.e., virtual) viewpoints of the scene on-the-fly when the video is being rendered and displayed. Exemplary types of single viewpoint and free viewpoint video that are supported by the video generation technique embodiments are described in more detail hereafter.
  • The video generation technique embodiments described herein are advantageous for various reasons including, but not limited to, the following. Generally speaking and as will be appreciated from the more detailed description that follows, the video generation technique embodiments serve to minimize the size of (i.e., minimize the amount of data in) the video that is generated, stored and distributed. Based on this video size/data minimization, it will also be appreciated that the video generation technique embodiments minimize the cost and maximize the performance associated with storing and transmitting the video in a client-server framework where the video is generated and stored on a server computing device, and then transmitted from the server over a data communication network to one or more client computing devices upon which the video is rendered and then viewed and navigated by the one or more end users. Furthermore, the video generation technique embodiments maximize the photo-realism of the video that is generated when it is rendered and then viewed and navigated by the end users. As such, the video generation technique embodiments provide the end users with photo-realistic video that is free of discernible artifacts, thus creating a feeling of immersion for the end users and enhancing their viewing experience.
  • Additionally, the video generation technique embodiments described herein eliminate the need to constrain the complexity or composition of the scene that is being captured (e.g., neither the environment(s) in the scene, nor the types of objects in the scene, nor the number of people in the scene, among other things, has to be constrained). Accordingly, the video generation technique embodiments are operational with any type of scene, including both relatively static and dynamic scenes. The video generation technique embodiments also provide a flexible, robust and commercially viable method for generating a video, and then presenting it to one or more end users, that meets the needs of today's various creative video producers and editors. By way of example but not limitation and as will be appreciated from the more detailed description that follows, the video generation technique embodiments are applicable to various types of video-based media applications such as consumer entertainment (e.g., movies, television shows, and the like) and video-conferencing/telepresence, among others.
  • 1.1 Video Processing Pipeline
  • FIG. 1 illustrates an exemplary embodiment, in simplified form, of a video processing pipeline for implementing the video generation technique embodiments described herein. As noted heretofore, the video generation technique embodiments support the generation, storage, distribution, and end user presentation of any type of video including, but not limited to, various types of single viewpoint video and various types of free viewpoint video. As exemplified in FIG. 1, the video processing pipeline 100 starts with a generation stage 102 during which, generally speaking, scene proxies of a given scene are generated. The generation stage 102 includes a capture sub-stage 104 and a processing sub-stage 106 whose operation will now be described in more detail.
  • Referring again to FIG. 1, the capture sub-stage 104 of the video processing pipeline 100 generally captures the scene and generates one or more streams of sensor data that represent the scene. More particularly, in an embodiment of the video generation technique described herein where a single viewpoint video is being generated, stored, distributed and presented to one or more end users (hereafter simply referred to as the single viewpoint embodiment of the video generation technique), during the capture sub-stage 104 a single sensor is used to capture the scene, where the single sensor includes a video capture device and generates a single stream of sensor data which represents the scene from a single geometric perspective. The stream of sensor data is received from the sensor and then output to the processing sub-stage 106. In another embodiment of the video generation technique described herein where a free viewpoint video is being generated, stored, distributed and presented to one or more end users (hereafter simply referred to as the free viewpoint embodiment of the video generation technique), during the capture sub-stage 104 an arrangement of sensors is used to capture the scene, where the arrangement includes a plurality of video capture devices and generates a plurality of streams of sensor data each of which represents the scene from a different geometric perspective. These streams of sensor data are received from the sensors and calibrated, and then output to the processing sub-stage 106.
  • Referring again to FIG. 1, the processing sub-stage 106 of the video processing pipeline 100 receives the stream(s) of sensor data from the capture sub-stage 104, and then generates scene proxies that geometrically describe the captured scene as a function of time from the stream(s) of sensor data. The scene proxies are then output to a storage and distribution stage 108.
  • Referring again to FIG. 1, the storage and distribution stage 108 of the video processing pipeline 100 receives the scene proxies from the processing sub-stage 106, stores the scene proxies, outputs the scene proxies and distributes them to one or more end users who either are, or will be, viewing the video, or both. In an exemplary embodiment of the video generation technique described herein where the generation stage 102 is implemented on one computing device (or a collection of computing devices) and an end user presentation stage 110 of the pipeline 100 is implemented on one or more end user computing devices, this distribution takes place by transmitting the scene proxies over whatever one or more data communication networks the end user computing devices are connected to. It will be appreciated that this transmission is implemented in a manner that meets the needs of the specific implementation of the video generation technique embodiments and the related type of video that is being processed in the pipeline 100.
  • Referring again to FIG. 1 and generally speaking, the end user presentation stage 110 of the video processing pipeline 100 receives the scene proxies that are output from the storage and distribution stage 108, and then presents each of the end users with a rendering of the scene proxies. The end user presentation stage 110 includes a rendering sub-stage 112 and a user viewing experience sub-stage 114 whose operation will now be described in more detail.
  • Referring again to FIG. 1, in the single viewpoint embodiment of the video generation technique described herein the rendering sub-stage 112 of the video processing pipeline 100 receives the scene proxies that are output from the storage and distribution stage 108, and then renders images of the captured scene from the scene proxies, where these images have a fixed viewpoint that cannot be modified by an end user. The fixed viewpoint images of the captured scene are then output to the user viewing experience sub-stage 114 of the pipeline 100. The user viewing experience sub-stage 114 receives the fixed viewpoint images of the captured scene from the rendering sub-stage 112, and then displays these images on a display device for viewing by a given end user. In situations where the generation stage 102 operates asynchronously from the end user presentation stage 110 (such as in the asynchronous single viewpoint video implementation that is described in more detail hereafter), the user viewing experience sub-stage 114 can provide the end user with the ability to interactively temporally navigate/control the single viewpoint video at will, and based on this temporal navigation/control the rendering sub-stage 112 will either temporally pause/stop, or rewind, or fast forward the single viewpoint video accordingly.
  • Referring again to FIG. 1, in the free viewpoint embodiment of the video generation technique described herein the rendering sub-stage 112 receives the scene proxies that are output from the storage and distribution stage 108, and then renders images of the captured scene from the scene proxies, where these images have a synthetic viewpoint that can be modified by an end user. The synthetic viewpoint images of the captured scene are then output to the user viewing experience sub-stage 114. The user viewing experience sub-stage 114 receives the synthetic viewpoint images of the captured scene from the rendering sub-stage 112, and then displays these images on a display device for viewing by a given end user. Generally speaking, the user viewing experience sub-stage 114 can provide the end user with the ability to spatio/temporally navigate/control the synthetic viewpoint images of the captured scene on-the-fly at will. In other words, the user viewing experience sub-stage 114 can provide the end user with the ability to continuously and interactively navigate/control their viewpoint of the images of the scene that are being displayed on the display device, and based on this viewpoint navigation the rendering sub-stage 112 will modify the images of the scene accordingly. In situations where the generation stage 102 operates asynchronously from the end user presentation stage 110 (such as in the asynchronous free viewpoint video implementation that is described in more detail hereafter), the user viewing experience sub-stage 114 can also provide the end user with the ability to interactively temporally navigate/control the free viewpoint video at will, and based on this temporal navigation/control the rendering sub-stage 112 will either temporally pause/stop, or rewind, or fast forward the free viewpoint video accordingly.
  • 1.2 Video Generation
  • This section provides a more detailed description of the generation stage of the video processing pipeline. As described heretofore, the video generation technique embodiments described herein generally employ one or more sensors which can be configured in various arrangements to capture a scene. These one or more sensors generate one or more streams of sensor data each of which represents the scene from a different geometric perspective.
  • FIG. 2 illustrates an exemplary embodiment, in simplified form, of a process for generating a video of a scene. As exemplified in FIG. 2, the process starts in block 200 with receiving the one or more streams of sensor data that represent the scene. Scene proxies are then generated from these streams of sensor data (block 202), where the scene proxies geometrically describe the scene as a function of time. The scene proxies can then be stored (block 204). In a situation where a given end user either is, or will be, viewing the video on another computing device which is connected to a data communication network, the scene proxies can also be distributed to the end user (block 206).
  • FIG. 3 illustrates an exemplary embodiment, in simplified form, of a process for generating the scene proxies from the one or more streams of sensor data that represent the scene. As exemplified in FIG. 3, the process starts in block 300 with generating a stream of mesh models of the scene from the streams of sensor data, where each of the mesh models includes a collection of vertices and a collection of polygonal faces that are formed by the vertices. The following actions then take place for each of the mesh models (block 302). The mesh model is sliced using a series of planes that are parallel to each other, where each of the planes in the series of planes defines one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane (block 304). A texture map for the mesh model is then generated, where the texture map defines texture data corresponding to each of the contours that is defined by the series of planes (block 306). It will be appreciated that this texture map can be generated using various methods, one example of which is described in more detail hereafter.
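  • By way of illustration only, the following Python sketch shows one straightforward way such slicing could be carried out for a triangle mesh and a series of horizontal planes; it is not the patented implementation, and the tolerance, the NumPy representation, and the deferred step of chaining segments into closed contours are assumptions made purely for clarity.

```python
# A minimal sketch (not the patented implementation) of slicing a triangle mesh
# with horizontal planes z = z0 and collecting the intersection segments.
import numpy as np

def slice_triangle(tri, z0, eps=1e-9):
    """Return the two 3-D points where a triangle crosses the plane z = z0, or None."""
    pts = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        da, db = a[2] - z0, b[2] - z0
        if abs(da) < eps:                 # vertex a lies on the plane
            pts.append(a)
        elif da * db < 0:                 # edge a-b crosses the plane
            t = da / (da - db)
            pts.append(a + t * (b - a))
    uniq = []                             # drop near-duplicate intersection points
    for p in pts:
        if not any(np.allclose(p, q, atol=eps) for q in uniq):
            uniq.append(p)
    return (uniq[0], uniq[1]) if len(uniq) == 2 else None

def slice_mesh(vertices, faces, plane_zs):
    """For each plane, collect the segments where the mesh intersects it."""
    segments_per_plane = {}
    for z0 in plane_zs:
        segs = []
        for f in faces:
            seg = slice_triangle(vertices[list(f)], z0)
            if seg is not None:
                segs.append(seg)
        segments_per_plane[z0] = segs     # chaining these into closed contours comes next
    return segments_per_plane
```

  • Chaining each plane's segments end to end (by matching shared endpoints) would then yield the one or more closed contours that the plane defines on the mesh model.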
  • FIGS. 4A-4C illustrate an exemplary embodiment, in simplified form, of a series of three planes that are parallel to each other and are used to slice a mesh model of a human. As exemplified in FIGS. 4A-4C, the series of planes 400/402/404 has a horizontal spatial orientation. As exemplified in FIG. 4A, the top-most plane 400 in the series of planes slices the mesh model 406 substantially along the shoulder region of the human and defines a single contour 408 along this region. As exemplified in FIG. 4B, the middle plane 402 in the series of planes slices the mesh model 406 substantially along the right bicep region, chest region and left bicep region of the human and defines two different contours 410 and 412. More particularly, the middle plane 402 defines a contour 410 along the right bicep region of the human. The middle plane 402 defines another contour 412 along the chest and left bicep regions of the human. As exemplified in FIG. 4C, the bottom-most plane 404 in the series of planes slices the mesh model 406 substantially along the right elbow region, upper stomach region, right hand region, left forearm region, and left hand region of the human and defines three different contours 414/416/418. More particularly, the bottom-most plane 404 defines a contour 414 along the right elbow region of the human. The bottom-most plane 404 defines another contour 416 along the upper stomach, left hand and left forearm regions of the human. The bottom-most plane 404 defines yet another contour 418 along the right hand region of the human.
  • FIG. 5 illustrates an exemplary embodiment, in simplified form, of a process for generating a texture map for a given mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes. This particular embodiment assumes that the texture map includes a series of scanlines each of which corresponds to a different one of the planes in the series of planes, where each of the scanlines includes a series of texels, and the total number of texels in each of the scanlines is the same. It will be appreciated that any number of texels per scanline can be used. In an exemplary embodiment of the video generation technique described herein the total number of texels in each of the scanlines is 1024. As exemplified in FIG. 5, the process starts in block 500 with the following actions taking place for each of the planes in the series of planes. Each of the contours that is defined by the plane is analyzed in a prescribed order across the plane to identify a series of point locations along the contour (block 502). This analysis of each of the contours that is defined by the plane is performed starting from a prescribed zero position on the contour, where this zero position is the same for each of the contours that is defined by the plane. In an exemplary embodiment of the video generation technique described herein successive point locations along the contour are separated by a prescribed distance which is measured along the contour, and the just-described order, distance and zero position are the same for each of the planes in the series of planes.
  • Referring again to FIG. 5, once each of the contours that is defined by the plane has been analyzed (block 502), for each of these contours, the series of point locations that is identified along the contour is used to determine a mathematical equation describing the contour (block 504), where this determination is made using conventional methods. The point locations that are identified along each of the contours that is defined by the plane can then optionally be entered into a contour point map (block 506), where these point locations are entered into the contour point map in the order in which they are identified. In one embodiment of the video generation technique described herein each of the point locations in the contour point map is represented by its two-dimensional coordinates on the particular plane on which it lies. In another embodiment of the video generation technique each of the point locations in the contour point map is represented by its angle from the point location that immediately precedes it on its contour. It will be appreciated that the contour point map may be used for additional types of processing. Once a mathematical equation describing each of the contours that is defined by the plane has been determined (block 504), the texels in the scanline of the texture map that corresponds to the plane are assigned to these contours, where this texel assignment is performed in the aforementioned prescribed order across the plane (block 508). It will be appreciated that the assignment (block 508) can be performed in various ways, one example of which is described in more detail hereafter. Information specifying the texel assignment (e.g., which of the texels in the scanline is assigned to which of the contours defined by the plane) is then entered into the texture map (block 510).
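  • As one hedged example of the conventional methods alluded to above, the sketch below turns the identified point locations into a closed piecewise-linear (i.e., polygon) description of the contour, consistent with the polygon-approximation embodiment described later; a NURBS fit could be substituted. The parameterization and the names are illustrative.

```python
import numpy as np

def polygon_equation(points):
    """Return a parametric function c(s), s in [0, 1), describing the contour as a
    closed polygon through the identified point locations (a sketch of one
    'conventional method'; a NURBS approximation is an alternative)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    def c(s):
        u = (s % 1.0) * n            # map the parameter onto the polygon's edges
        i = int(u) % n
        t = u - int(u)
        return (1.0 - t) * pts[i] + t * pts[(i + 1) % n]
    return c
```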
  • Referring again to FIG. 5, once the actions of block 500 have been completed, the one or more streams of sensor data that represent the scene are used to compute texture data for each of the texels that is in the texture map (block 512), and this computed texture data is entered into the texture map (block 514). In an exemplary embodiment of the video generation technique described herein, this texture data computation is performed in the following manner. For each of the texels that is in the texture map, a projective texture mapping method is used to sample each of the streams of sensor data that represent the scene and combine texture information from each of these samples to generate texture data for the texel. It is noted that in a situation where just a single stream of sensor data is captured (i.e., in the aforementioned single viewpoint implementation of the video generation technique), just a single stream of sensor data will be sampled. Generally speaking and as is appreciated in the art of parametric texture mapping, the video generation technique embodiments described herein can compute various textures in order to maximize the photo-realism of the objects that are represented by the mesh models. In other words, the texture data that is computed for each of the texels that is in the texture map can include one or more of color data, or specular highlight data, or transparency data, or reflection data, or shadowing data, among other things.
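  • The following sketch illustrates, under simplifying assumptions (a pinhole camera model, nearest-neighbor sampling, no occlusion handling, and illustrative parameters K, R, t per view), how a projective-texture-style color sample might be gathered for a texel's corresponding contour point from one or more calibrated sensor streams; it is a minimal example, not the patent's method.

```python
# A hedged sketch of projective texture sampling for one 3-D contour point.
import numpy as np

def project_point(point_3d, K, R, t):
    """Project a world-space point into pixel coordinates using a pinhole model."""
    cam = R @ point_3d + t                 # world -> camera space
    uvw = K @ cam                          # camera -> image space
    return uvw[:2] / uvw[2]                # perspective divide

def sample_color(image, uv):
    """Nearest-neighbor sample; a real system would use bilinear filtering."""
    u, v = int(round(uv[0])), int(round(uv[1]))
    h, w = image.shape[:2]
    if 0 <= v < h and 0 <= u < w:
        return image[v, u].astype(float)
    return None                            # point falls outside this view

def texel_color(point_3d, views):
    """Average the color over every view (image, K, R, t) that sees the point.
    Occlusion tests and per-view blending weights are omitted for brevity."""
    samples = []
    for image, K, R, t in views:
        c = sample_color(image, project_point(point_3d, K, R, t))
        if c is not None:
            samples.append(c)
    return np.mean(samples, axis=0) if samples else np.zeros(3)
```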
  • Referring again to FIG. 5, once the computed texture data for each of the texels has been entered into the texture map (block 514), each of the scanlines in the texture map can optionally be replicated a prescribed number of times, where this number is greater than or equal to one (block 516). By way of example but not limitation, in the case where this prescribed number of times is one, each of the scanlines in the texture map will be replicated once so that two neighboring scanlines in the texture map will correspond to each of the planes in the series of planes. It is noted that a trade-off exists in the selection of the number of times each of the scanlines in the texture map is replicated. More particularly, replicating each of the scanlines a larger number of times creates more texture resolution in the images of the scene that are rendered from the scene proxies and displayed to the end users, but can also increase the amount of storage and network communication resources that are used in the aforementioned storage and distribution stage of the video processing pipeline exemplified in FIG. 1, and can also increase the amount of processing that is associated with the aforementioned rendering sub-stage of the video processing pipeline. On the other hand, replicating each of the scanlines a smaller number of times creates less texture resolution in the images of the scene that are rendered, but can also decrease the amount of storage and network communication resources that are used in the storage and distribution stage, and can also decrease the amount of processing that is associated with the rendering sub-stage.
  • It is noted that the contours that are defined by a given plane can be analyzed in any order across the plane. By way of example but not limitation, in an exemplary embodiment of the video generation technique described herein the contours that are defined by each of the planes are analyzed in a left-to-right order across the plane. It is also noted that the just-described zero position on the contour can be any position thereon as long as it is the same for each of the contours being analyzed. In an exemplary embodiment of the video generation technique this zero position is the left-most point on the contour. It is also noted that a trade-off exists in the selection of the distance which separates successive point locations along the contours that are defined by the series of planes. More particularly, using a smaller distance between successive point locations along the contours creates a finer approximation of each of the mesh models, but also increases the amount of processing that is associated with the aforementioned processing sub-stage of the video processing pipeline exemplified in FIG. 1. On the other hand, using a larger distance between successive point locations along the contours creates a coarser approximation of each of the mesh models, but also decreases the amount of processing that is associated with the processing sub-stage.
  • FIG. 6 illustrates an exemplary embodiment, in simplified form, of a contour and point locations that are identified along the contour. As exemplified in FIG. 6, a series of point locations 602-620 are identified along the contour 600. Successive point locations along the contour 600 (e.g., point location 614 and point location 616) are separated by the distance D which is measured along the contour.
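  • A minimal sketch of identifying point locations separated by a prescribed distance D measured along a closed two-dimensional contour, starting from its zero position, is shown below; NumPy is used for brevity and the names are illustrative.

```python
import numpy as np

def resample_contour(points, spacing):
    """Walk a closed contour (an N x 2 array of ordered points) and emit a point
    every `spacing` units of arc length, starting from points[0] (the contour's
    prescribed zero position)."""
    pts = np.vstack([points, points[:1]])            # close the loop
    seg = np.diff(pts, axis=0)
    seg_len = np.linalg.norm(seg, axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])
    targets = np.arange(0.0, cum[-1], spacing)       # arc-length positions to hit
    out = []
    for d in targets:
        i = np.searchsorted(cum, d, side='right') - 1
        t = (d - cum[i]) / seg_len[i] if seg_len[i] > 0 else 0.0
        out.append(pts[i] + t * seg[i])              # interpolate within segment i
    return np.array(out)
```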
  • FIG. 7 illustrates an exemplary embodiment, in simplified form, of a contour point map. The contour point map 700 exemplified in FIG. 7 corresponds to the series of three planes 400/402/404 that are used to slice the mesh model of the human 406 in FIGS. 4A-4C. The contour point map 700 includes three lines of data 702/704/706 each of which corresponds to a different one of the three planes 400/402/404. More particularly, a first line of data 702 in the contour point map 700 corresponds to the top-most plane 400. The first line of data 702 specifies that the top-most plane 400 defines one contour (namely, contour 408) and lists each of the A total point locations that are identified along this contour (namely, P1,1, P1,2, . . . , P1,A). The second line of data 704 specifies that the middle plane 402 defines two contours (namely, contour 410 and contour 412). The second line of data 704 also lists each of the B total point locations that are identified along contour 410 (namely, P1,1, P1,2, . . . , P1,B), and then lists each of the C total point locations that are identified along contour 412 (namely, P2,1, P2,2, . . . , P2,C). The third line of data 706 specifies that the bottom-most plane 404 defines three contours (namely, contour 414, contour 416, and contour 418). The third line of data 706 also lists each of the D total point locations that are identified along contour 414 (namely, P1,1, P1,2, . . . , P1,D), and then lists each of the E total point locations that are identified along contour 416 (namely, P2,1, P2,2, . . . , P2,E), and then lists each of the F total point locations that are identified along contour 418 (namely, P3,1, P3,2, . . . , P3,F).
  • FIG. 12 illustrates an exemplary embodiment, in simplified form, of a process for assigning the texels in the scanline of the texture map that corresponds to the plane to the contours that are defined by the plane. As will now be described in more detail, the process embodiment exemplified in FIG. 12 generally determines what percentage of the texels in the scanline are to be assigned to each of the contours that is defined by the plane. The process starts in block 1200 with the following actions taking place for each of the contours that are defined by the plane. The length of the contour is calculated (block 1202). The normalized length of the contour is then calculated by dividing the length of the contour by the sum of the lengths of all of the contours that are defined by the plane (block 1204). The number of texels in the scanline that are to be assigned to the contour is then calculated by multiplying the normalized length of the contour by the total number of texels in the scanline (block 1206), and this calculated number of texels is assigned to the contour (block 1208).
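  • The normalized-length allocation of FIG. 12 can be sketched in a few lines; the rounding policy used here is an assumption.

```python
def allocate_texels(contour_lengths, texels_per_scanline=1024):
    """Assign each contour a share of the scanline's texels in proportion to its
    normalized length (a sketch of the FIG. 12 allocation)."""
    total = sum(contour_lengths)
    counts = [int(round(texels_per_scanline * length / total))
              for length in contour_lengths]
    counts[-1] += texels_per_scanline - sum(counts)   # absorb any rounding drift
    return counts
```

  • For the example described in connection with FIG. 14 below (two contours whose lengths are in a 2:1 ratio and a six-texel scanline), allocate_texels([2.0, 1.0], 6) returns [4, 2], matching the four/two texel split shown there.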
  • FIG. 14 illustrates a simplified example of the assignment of the texels in two neighboring scanlines of an exemplary texture map to exemplary contours that are defined by two neighboring planes that correspond to the two neighboring scanlines. As exemplified in FIG. 14, a first scanline 1400 in the texture map (not shown) has a series of six texels T1,1-T1,6, and a second scanline 1402 which immediately succeeds the first scanline in the texture map has another series of six texels T2,1-T2,6. There are two contours 1404 and 1406 that are defined by a first plane (not shown) in a series of planes (not shown), and there are another two contours 1408 and 1410 that are defined by a second plane (not shown) which immediately succeeds the first plane in the series of planes. Based on the normalization of the lengths of the two contours 1404 and 1406 that are defined by the first plane, the first four texels T1,1-T1,4 in the first scanline 1400 will be assigned to contour 1404, and the last two texels T1,5 and T1,6 in the first scanline will be assigned to contour 1406. Similarly, based on the normalization of the lengths of the two contours 1408 and 1410 that are defined by the second plane, the first two texels T2,1 and T2,2 in the second scanline 1402 will be assigned to contour 1408, and the last four texels T2,3-T2,6 in the second scanline will be assigned to contour 1410.
  • The series of planes that is used to slice each of the mesh models has a prescribed spatial orientation and a prescribed geometry. It will be appreciated that the geometry of the series of planes can be specified using various types of data. Examples of such types of data include, but are not limited to, data specifying the number of planes in the series of planes, data specifying a prescribed spacing that is used between successive planes in the series of planes, and data specifying the shape and dimensions of each of the planes in the series of planes. Both the spatial orientation and the geometry of the series of planes are arbitrary and as such, various spatial orientations and geometries can be used for the series of planes. In one embodiment of the video generation technique the series of planes has a horizontal spatial orientation. In another embodiment of the video generation technique the series of planes has a vertical spatial orientation. It will be appreciated that any number of planes can be used, any spacing between successive planes can be used, any plane shape/dimensions can be used, and any distance which separates successive point locations along the contours can be used. In an exemplary embodiment of the video generation technique described herein the spacing that is used between successive planes in the series of planes is selected such that the series of planes intersects a maximum number of vertices in each of the mesh models. This is advantageous in a situation where each of the mesh models of the scene includes a mesh texture map that defines texture data which has already been computed for the vertices of the mesh model.
  • In one embodiment of the video generation technique described herein the spatial orientation and geometry of the series of planes, the order across each of the planes in the series of planes by which each of the contours that is defined by the plane is analyzed (hereafter simply referred to as the contour analysis order), the number of texels in each of the scanlines, the zero position on each of the contours (hereafter simply referred to as the contour zero position), and the number of times each of the scanlines in the texture map is replicated are pre-determined and thus are known to each of the end user computing devices in advance of the scene proxies being distributed thereto. In another embodiment of the video generation technique one or more of the spatial orientation of the series of planes, or the geometry of the series of planes, or the contour analysis order, or the number of texels in each of the scanlines, or the contour zero position, or the number of times each of the scanlines in the texture map is replicated, may not be pre-determined and thus may not be known to each of the end user computing devices in advance of the scene proxies being distributed thereto.
  • FIG. 8 illustrates an exemplary embodiment, in simplified form, of a process for storing the scene proxies. As exemplified in FIG. 8, the following actions take place for each of the mesh models (block 800). The mathematical equation describing each of the contours that is defined by the series of planes is stored (block 802). Data specifying which contours on neighboring planes in the series of planes correspond to each other (e.g., which neighboring contours are associated with either the same object or the same surface of a given object in the scene) is also stored (block 804). The texture map for the mesh model is also stored (block 806). Whenever the spatial orientation of the series of planes is not pre-determined, data specifying this spatial orientation will also be stored (block 808). Whenever the geometry of the series of planes is not pre-determined, data specifying this geometry will also be stored (block 810). Whenever the contour analysis order is not pre-determined, data specifying the contour analysis order will also be stored (block 812). Whenever the number of texels in each of the scanlines is not pre-determined, data specifying the number of texels in each of the scanlines will also be stored (block 814). Whenever the contour zero position is not pre-determined, data specifying the contour zero position will also be stored (block 816). Whenever the number of times each of the scanlines in the texture map is replicated is not pre-determined, data specifying the number of times each of the scanlines in the texture map is replicated will also be stored (block 818).
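  • Purely as an illustration of the data enumerated above, a container such as the following might hold one mesh model's worth of stored scene-proxy data; the field names are assumptions, not terminology from this description.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

import numpy as np

@dataclass
class SceneProxyFrame:
    """Illustrative container for the per-mesh-model data that FIG. 8 stores."""
    contour_equations: List[List[Any]]            # per plane, one equation per contour
    contour_correspondence: List[List[int]]       # which contours on neighboring planes match
    texture_map: np.ndarray                       # one (possibly replicated) scanline per plane
    # The remaining fields would only be populated when not pre-determined.
    plane_orientation: Optional[str] = None       # e.g., 'horizontal' or 'vertical'
    plane_geometry: Optional[dict] = None         # plane count, spacing, shape and dimensions
    contour_analysis_order: Optional[str] = None  # e.g., 'left-to-right'
    texels_per_scanline: Optional[int] = None     # e.g., 1024
    contour_zero_position: Optional[str] = None   # e.g., 'left-most point'
    scanline_replication: Optional[int] = None    # times each scanline is replicated
```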
  • In one embodiment of the video generation technique described herein the mathematical equation describing a given contour specifies a polygon approximation of the contour. In another embodiment of the video generation technique the mathematical equation describing a given contour specifies a non-uniform rational basis spline (NURBS) curve approximation of the contour.
  • FIG. 9 illustrates an exemplary embodiment, in simplified form, of a process for distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network. As exemplified in FIG. 9, the following actions take place for each of the mesh models (block 900). The mathematical equation describing each of the contours that is defined by the series of planes is transmitted over the network to the other computing device (block 902). Data specifying which contours on neighboring planes in the series of planes correspond to each other is also transmitted over the network to the other computing device (block 904). The texture map for the mesh model is also transmitted over the network to the other computing device (block 906). Whenever the spatial orientation of the series of planes is not pre-determined, data specifying this spatial orientation will also be transmitted over the network to the other computing device (block 908). Whenever the geometry of the series of planes is not pre-determined, data specifying this geometry will also be transmitted over the network to the other computing device (block 910). Whenever the contour analysis order is not pre-determined, data specifying the contour analysis order will also be transmitted over the network to the other computing device (block 912). Whenever the number of texels in each of the scanlines is not pre-determined, data specifying the number of texels in each of the scanlines will also be transmitted over the network to the other computing device (block 914). Whenever the contour zero position is not pre-determined, data specifying the contour zero position will also be transmitted over the network to the other computing device (block 916). Whenever the number of times each of the scanlines in the texture map is replicated is not pre-determined, data specifying the number of times each of the scanlines in the texture map is replicated will also be transmitted over the network to the other computing device (block 918).
  • 1.3 Video Presentation To End User
  • This section provides a more detailed description of the end user presentation stage of the video processing pipeline.
  • FIG. 10 illustrates an exemplary embodiment, in simplified form, of a process for presenting a video of a scene to an end user. As exemplified in FIG. 10, the process starts in block 1000 with receiving scene proxies that geometrically describe the scene as a function of time. The scene proxies include a stream of mathematical equations which describe contours that are defined by a series of planes that are parallel to each other. The scene proxies also include a stream of data specifying which contours on neighboring planes in the series of planes correspond to each other. The scene proxies also include a stream of texture maps which define texture data corresponding to each of the contours that is defined by the series of planes. After the scene proxies have been received (block 1000), images of the scene are rendered from the scene proxies (block 1002). The images of the scene are then displayed on a display device (block 1004) so that they can be viewed and navigated by the end user. As described heretofore, the video that is being presented to the end user can be any type of video including, but not limited to, asynchronous single viewpoint video, or asynchronous free viewpoint video, or unidirectional live single viewpoint video, or unidirectional live free viewpoint video, or bidirectional live single viewpoint video, or bidirectional live free viewpoint video.
  • FIG. 11 illustrates an exemplary embodiment, in simplified form, of a process for rendering images of the scene from the scene proxies. As exemplified in FIG. 11, the process starts in block 1100 with constructing the series of planes using data specifying the spatial orientation of the series of planes and the geometry of the series of planes. The contours that are defined by the series of planes are then constructed using the stream of mathematical equations (block 1102). A series of point locations is then constructed along each of the contours that is defined by the series of planes (block 1104), where this construction is performed in the aforementioned prescribed order across each of the planes in the series of planes, and this construction is also performed starting from the aforementioned prescribed zero position on each of the contours that is defined by each of the planes in the series of planes. In an exemplary embodiment of the video generation technique described herein a common number of point locations is constructed along each of the contours, and these point locations can be interspersed along the contour such that they are equidistant from each other. It will be appreciated that the actions of blocks 1100, 1102 and 1104 generate a stream of 3D point models that generally correspond to the stream of mesh models of the scene that was originally sliced using the series of planes. It will also be appreciated that this common number of point locations can have any value; a trade-off associated with the selection of this value is described in more detail hereafter.
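  • A hedged sketch of the reconstruction step just described follows: each contour's equation is evaluated at a common number of evenly spaced parameter values, beginning at the zero position (s = 0), and the resulting two-dimensional points are lifted onto their plane's height to form a 3D point model. The names, the horizontal-plane convention, and the parametric representation are illustrative assumptions.

```python
import numpy as np

def build_point_rings(contour_fns_per_plane, plane_zs, count):
    """Reconstruct a 3-D point model from a scene proxy.  contour_fns_per_plane[p]
    is the list of parametric contour functions s in [0, 1) -> (x, y) for plane p,
    and plane_zs[p] is that plane's height."""
    rings = []
    for z, fns in zip(plane_zs, contour_fns_per_plane):
        plane_rings = []
        for fn in fns:
            pts2d = np.array([fn(k / count) for k in range(count)])
            plane_rings.append(np.column_stack([pts2d, np.full(count, z)]))
        rings.append(plane_rings)
    return rings
```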
  • The video generation technique embodiments described herein assume that each of the mathematical equations specifies how the particular contour it describes is positioned on the particular plane that defines this contour (e.g., by specifying one or more control points for the contour, among other ways). An alternate embodiment of the video generation technique is also possible where each of the mathematical equations does not specify how the particular contour it describes is positioned on the particular plane that defines this contour, in which case this positioning information will be separately specified, stored and distributed to the end user.
  • Referring again to FIG. 11, after the series of point locations has been constructed along each of the contours that is defined by the series of planes (block 1104), the point locations that are defined by the series of planes are tessellated using conventional methods, where this tessellation generates a stream of polygonal models, and each polygonal model includes a collection of polygonal faces that are formed by neighboring point locations on corresponding contours on neighboring planes in the series of planes (block 1106). Different embodiments of the video generation technique described herein are possible where the polygonal faces are either triangles, or quadrilaterals, or any other prescribed type of polygon. It will be appreciated that the action of block 1106 serves to recreate the stream of mesh models of the scene. It is noted that a trade-off exists in the selection of the common number of point locations that is constructed along each of the contours. More particularly, using a larger common number of point locations creates denser polygonal models and thus a higher resolution in the images of the scene that are rendered from the scene proxies and displayed to the end users, but also increases the amount of processing that is associated with the aforementioned rendering sub-stage of the video processing pipeline exemplified in FIG. 1. On the other hand, using a smaller common number of point locations creates less dense polygonal models and thus a lower resolution in the images of the scene that are rendered from the scene proxies, but also decreases the amount of processing that is associated with the rendering sub-stage.
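  • As an illustration of the tessellation action of block 1106, the following is a minimal sketch, in Python, that bridges two corresponding contours on neighboring planes (each resampled with the same common number of point locations) into triangular polygonal faces; the two-triangles-per-quad layout and the index bookkeeping are illustrative choices rather than a tessellation mandated by the video generation technique embodiments.

```python
# Hypothetical sketch: form triangular faces between two corresponding
# contours that were resampled with the same common number of point locations.
def bridge_contours(ring_a, ring_b):
    """ring_a, ring_b: lists of 3-D points of equal length from corresponding
    contours on neighboring planes; returns (vertices, triangle index tuples)."""
    n = len(ring_a)
    vertices = list(ring_a) + list(ring_b)      # indices 0..n-1 and n..2n-1
    triangles = []
    for i in range(n):
        j = (i + 1) % n                         # wrap around the closed contour
        triangles.append((i, j, n + i))         # first triangle of the quad
        triangles.append((j, n + j, n + i))     # second triangle of the quad
    return vertices, triangles
```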
  • Referring again to FIG. 11, after the point locations that are defined by the series of planes are tessellated (block 1106), the stream of texture maps is sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models (block 1108). Conventional methods are then employed to use this identified texture data to add texture to each of the polygonal faces in the stream of polygonal models (block 1110).
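  • The following is a minimal sketch, in Python, of one way the texture map could be indexed for a given point location during the sampling of block 1108, under the assumption that the i-th of the common number of point locations on a contour maps proportionally into that contour's run of texels in the scanline corresponding to the contour's plane; the run-offset bookkeeping is an illustrative assumption rather than a layout mandated by the video generation technique embodiments. A renderer would then fetch (e.g., bilinearly interpolate) the texel at the returned position for each vertex of a polygonal face.

```python
# Hypothetical sketch: map a point location on a contour to a (row, column)
# position in the texture map, assuming proportional placement within the
# contour's texel run.
def texel_coordinate(scanline_row, run_start, run_length, point_index, num_points):
    """The contour owns run_length texels starting at column run_start of
    scanline scanline_row; point_index runs from 0 to num_points - 1."""
    column = run_start + (point_index / num_points) * run_length
    return scanline_row, column
```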
  • FIG. 15 illustrates an exemplary embodiment, in simplified form, of a process for sampling the stream of texture maps to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models. As will be appreciated from the more detailed description that follows, this particular embodiment serves to minimize any sampling errors that may occur during the sampling of the stream of texture maps. As exemplified in FIG. 15, the process starts in block 1500 with an action of, for each of the scanlines in each of the texture maps, adapting the number of texels in the scanline that are assigned to each one of the contours that is defined by the plane corresponding to the scanline to be the average of the number of texels in the scanline that are assigned to this one of the contours and the number of texels in the next scanline in the series of scanlines that are assigned to another contour that corresponds to this one of the contours, where this adaption results in a modified version of each of the texture maps. The modified version of each of the texture maps is then sampled to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models (block 1502).
  • In an exemplary embodiment of the video generation technique described herein, the just-described adaption operates in the following manner. In the case where the number of texels in a given scanline that are assigned to a particular contour that is defined by the plane corresponding to the scanline is greater than the average of the number of texels in the scanline that are assigned to the particular contour and the number of texels in the next scanline in the series of scanlines that are assigned to another contour that corresponds to the particular contour, the adaption of the number of texels in the scanline that are assigned to the particular contour involves using conventional methods to contract a series of texels in the scanline. In the case where the number of texels in a given scanline that are assigned to a particular contour that is defined by the plane corresponding to the scanline is less than the average of the number of texels in the scanline that are assigned to the particular contour and the number of texels in the next scanline in the series of scanlines that are assigned to another contour that corresponds to the particular contour, the adaption of the number of texels in the scanline that are assigned to the particular contour involves using conventional methods to expand a series of texels in the scanline.
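  • The following is a minimal sketch, in Python, of this adaption for one contour's run of texels in one scanline; the nearest-neighbor resampling used here to expand or contract the run is an illustrative stand-in for the conventional expansion and contraction methods mentioned above, and the run-based data layout is an assumption.

```python
# Hypothetical sketch: resize a contour's texel run to the average of its
# current length and the length assigned to the corresponding contour in the
# next scanline, using nearest-neighbor resampling.
def adapt_texel_run(run_this, run_next_len):
    """run_this: list of texels assigned to a contour in this scanline;
    run_next_len: texel count assigned to the corresponding contour in the
    next scanline. Returns the contracted or expanded run."""
    target = round((len(run_this) + run_next_len) / 2)
    if target == len(run_this) or not run_this:
        return list(run_this)
    # Nearest-neighbor resample handles both contraction and expansion.
    return [run_this[min(int(i * len(run_this) / target), len(run_this) - 1)]
            for i in range(target)]
```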
  • In addition to performing the just-described adaption of the number of scanline texels that are assigned to each one of the contours that is defined by the plane corresponding to each of the scanlines in each of the texture maps, any sampling errors that may occur during the sampling of the stream of texture maps can further be minimized by optionally performing an action of, for each of the scanlines in each of the texture maps, inserting one or more “gutter texels” between neighboring texel series in the scanline, and optionally also inserting one or more “gutter scanlines” between neighboring scanlines in each of the texture maps. It will be appreciated that implementing such gutter texels and gutter scanlines serves to prevent bleed-over when the stream of texture maps is sampled using certain sampling methods such as the conventional bilinear interpolation method.
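  • The following is a minimal sketch, in Python, of the optional gutter insertion, under the assumption that each scanline is held as a list of per-contour texel runs; a single gutter texel (a copy of the run's boundary texel) is inserted between neighboring runs, and a single gutter scanline (a copy of the scanline) is inserted between neighboring scanlines, so that bilinear sampling near a boundary does not blend texture data belonging to different contours. The single-texel and single-row gutter widths are illustrative choices.

```python
# Hypothetical sketch: insert gutter texels between neighboring texel runs and
# gutter scanlines between neighboring scanlines to limit bilinear bleed-over.
def add_gutters(rows_of_runs):
    """rows_of_runs: list of scanlines, each a list of texel runs (lists).
    Returns a list of flat scanlines with gutters inserted."""
    guttered = []
    for runs in rows_of_runs:
        scanline = []
        for k, run in enumerate(runs):
            scanline.extend(run)
            if k + 1 < len(runs) and run:
                scanline.append(run[-1])        # gutter texel between neighboring runs
        guttered.append(scanline)
        guttered.append(list(scanline))         # gutter scanline duplicating this row
    return guttered
```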
  • 1.4 Supported Video Types
  • This section provides a more detailed description of exemplary types of single viewpoint video and exemplary types of free viewpoint video that are supported by the video generation technique embodiments described herein.
  • Referring again to FIG. 1, one implementation of the single viewpoint embodiment of the video generation technique described heretofore supports asynchronous (i.e., non-live) single viewpoint video, and a similar implementation of the free viewpoint embodiment of the video generation technique described heretofore supports asynchronous free viewpoint video. Both of these implementations correspond to a situation where the streams of sensor data that are generated by the sensors are pre-captured 104, then post-processed 106, and the resulting scene proxies are then stored and can be transmitted in a one-to-many manner (i.e., broadcast) to one or more end users 108. As such, there is effectively an unlimited amount of time available for the processing sub-stage 106. This allows a video producer to optionally manually “touch-up” the streams of sensor data that are received during the capture sub-stage 104, and also optionally manually remove any three-dimensional reconstruction artifacts that are introduced in the processing sub-stage 106. These particular implementations are referred to hereafter as the asynchronous single viewpoint video implementation and the asynchronous free viewpoint video implementation respectively. Exemplary types of video-based media that work well in the asynchronous single viewpoint video and asynchronous free viewpoint video implementations include movies, documentaries, sitcoms and other types of television shows, music videos, digital memories, and the like. Another exemplary type of video-based media that works well in the asynchronous single viewpoint video and asynchronous free viewpoint video implementations is the use of special effects technology where synthetic objects are realistically modeled, lit, shaded and added to a pre-captured scene.
  • Referring again to FIG. 1, another implementation of the single viewpoint embodiment of the video generation technique supports unidirectional (i.e., one-way) live single viewpoint video, and a similar implementation of the free viewpoint embodiment of the video generation technique supports unidirectional live free viewpoint video. Both of these implementations correspond to a situation where the streams of sensor data that are being generated by the sensors are concurrently captured 104 and processed 106, and the resulting scene proxies are stored and transmitted in a one-to-many manner on-the-fly (i.e., live) to one or more end users 108. As such, each end user can view 114 the scene live (i.e., each user can view the scene at substantially the same time it is being captured 104). These particular implementations are referred to hereafter as the unidirectional live single viewpoint video implementation and the unidirectional live free viewpoint video implementation respectively. Exemplary types of video-based media that work well in the unidirectional live single viewpoint video and unidirectional live free viewpoint video implementations include sporting events, news programs, live concerts, and the like.
  • Referring again to FIG. 1, yet another implementation of the single viewpoint embodiment of the video generation technique supports bidirectional (i.e., two-way) live single viewpoint video (such as that which is associated with various video-conferencing/telepresence applications), and a similar implementation of the free viewpoint embodiment of the video generation technique supports bidirectional live free viewpoint video. These particular implementations are referred to hereafter as the bidirectional live single viewpoint video implementation and the bidirectional live free viewpoint video implementation respectively. The bidirectional live single/free viewpoint video implementation is generally the same as the unidirectional live single/free viewpoint video implementation with the following exception. In the bidirectional live single/free viewpoint video implementation a computing device at each physical location that is participating in a given video-conferencing/telepresence session is able to concurrently capture 104 streams of sensor data that are being generated by sensors which are capturing a local scene and process 106 these locally captured streams of sensor data, store and transmit the resulting local scene proxies in a one-to-many manner on the fly to the other physical locations that are participating in the session 108, receive remote scene proxies from each of the remote physical locations that are participating in the session 108, and render 112 each of the received proxies.
  • Referring again to FIG. 1, it will be appreciated that in the unidirectional and bidirectional live single and free viewpoint video implementations, in order for an end user to be able to view the scene live, the generation, storage and distribution, and end user presentation stages 102/108/110 have to be completed within a very short period of time. The video generation technique embodiments described herein make this possible by virtue of the aforementioned video size/data minimization that these embodiments achieve.
  • 2.0 Additional Embodiments
  • While the video generation technique has been described by specific reference to embodiments thereof, it is understood that variations and modifications thereof can be made without departing from the true spirit and scope of the video generation technique. By way of example but not limitation, rather than supporting the generation, storage, distribution, and end user presentation of video, alternate embodiments of the video generation technique described herein are possible which support any other digital image application where a scene is represented by a mesh model and a corresponding mesh texture map which defines texture data for the mesh model.
  • It is also noted that any or all of the aforementioned embodiments can be used in any combination desired to form additional hybrid embodiments. Although the video generation technique embodiments have been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described heretofore. Rather, the specific features and acts described heretofore are disclosed as example forms of implementing the claims.
  • 3.0 Computing Environment
  • The video generation technique embodiments described herein are operational within numerous types of general purpose or special purpose computing system environments or configurations. FIG. 13 illustrates a simplified example of a general-purpose computer system on which various embodiments and elements of the video generation technique, as described herein, may be implemented. It is noted that any boxes that are represented by broken or dashed lines in FIG. 13 represent alternate embodiments of the simplified computing device, and that any or all of these alternate embodiments, as described below, may be used in combination with other alternate embodiments that are described throughout this document.
  • For example, FIG. 13 shows a general system diagram showing a simplified computing device 1300. Such computing devices can typically be found in devices having at least some minimum computational capability, including, but not limited to, personal computers (PCs), server computers, handheld computing devices, laptop or mobile computers, communications devices such as cell phones and personal digital assistants (PDAs), multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and audio or video media players.
  • To allow a device to implement the video generation technique embodiments described herein, the device should have a sufficient computational capability and system memory to enable basic computational operations. In particular, as illustrated by FIG. 13, the computational capability is generally illustrated by one or more processing unit(s) 1310, and may also include one or more graphics processing units (GPUs) 1315, either or both in communication with system memory 1320. Note that the processing unit(s) 1310 may be specialized microprocessors (such as a digital signal processor (DSP), a very long instruction word (VLIW) processor, a field-programmable gate array (FPGA), or other micro-controller) or can be conventional central processing units (CPUs) having one or more processing cores including, but not limited to, specialized GPU-based cores in a multi-core CPU.
  • In addition, the simplified computing device 1300 of FIG. 13 may also include other components, such as, for example, a communications interface 1330. The simplified computing device 1300 of FIG. 13 may also include one or more conventional computer input devices 1340 (e.g., pointing devices, keyboards, audio (e.g., voice) input/capture devices, video input/capture devices, haptic input devices, devices for receiving wired or wireless data transmissions, and the like). The simplified computing device 1300 of FIG. 13 may also include other optional components, such as, for example, one or more conventional computer output devices 1350 (e.g., display device(s) 1355, audio output devices, video output devices, devices for transmitting wired or wireless data transmissions, and the like). Exemplary types of input devices (herein also referred to as user interface modalities) and display devices that are operable with the video generation technique embodiments described herein have been described heretofore. Note that typical communications interfaces 1330, additional types of input and output devices 1340 and 1350, and storage devices 1360 for general-purpose computers are well known to those skilled in the art, and will not be described in detail herein.
  • The simplified computing device 1300 of FIG. 13 may also include a variety of computer readable media. Computer readable media can be any available media that can be accessed by the computer 1300 via storage devices 1360, and includes both volatile and nonvolatile media that are removable 1370 and/or non-removable 1380, for storage of information such as computer-readable or computer-executable instructions, data structures, program modules, or other data. By way of example but not limitation, computer readable media may include computer storage media and communication media. Computer storage media includes, but is not limited to, computer or machine readable media or storage devices such as digital versatile disks (DVDs), compact discs (CDs), floppy disks, tape drives, hard drives, optical drives, solid state memory devices, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, magnetic cassettes, magnetic tapes, magnetic disk storage, or other magnetic storage devices, or any other device which can be used to store the desired information and which can be accessed by one or more computing devices.
  • Storage of information such as computer-readable or computer-executable instructions, data structures, program modules, and the like, can also be accomplished by using any of a variety of the aforementioned communication media to encode one or more modulated data signals or carrier waves, or other transport mechanisms or communications protocols, and includes any wired or wireless information delivery mechanism. Note that the terms “modulated data signal” or “carrier wave” generally refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media includes wired media such as a wired network or direct-wired connection carrying one or more modulated data signals, and wireless media such as acoustic, radio frequency (RF), infrared, laser, and other wireless media for transmitting and/or receiving one or more modulated data signals or carrier waves. Combinations of any of the above should also be included within the scope of communication media.
  • Furthermore, software, programs, and/or computer program products embodying some or all of the various embodiments of the video generation technique described herein, or portions thereof, may be stored, received, transmitted, or read from any desired combination of computer or machine readable media or storage devices and communication media in the form of computer executable instructions or other data structures.
  • Finally, the video generation technique embodiments described herein may be further described in the general context of computer-executable instructions, such as program modules, being executed by a computing device. Generally, program modules include routines, programs, objects, components, data structures, and the like, that perform particular tasks or implement particular abstract data types. The video generation technique embodiments may also be practiced in distributed computing environments where tasks are performed by one or more remote processing devices, or within a cloud of one or more devices, that are linked through one or more communications networks. In a distributed computing environment, program modules may be located in both local and remote computer storage media including media storage devices. Additionally, the aforementioned instructions may be implemented, in part or in whole, as hardware logic circuits, which may or may not include a processor.

Claims (20)

Wherefore, what is claimed is:
1. A computer-implemented process for generating a video of a scene, comprising:
using a computing device to perform the following process actions:
receiving one or more streams of sensor data that represent the scene; and
generating scene proxies from said streams of sensor data, said generation comprising the actions of:
generating a stream of mesh models of the scene from said streams of sensor data, and
for each of the mesh models,
slicing the mesh model using a series of planes that are parallel to each other, each of the planes in the series defining one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane, and
generating a texture map for the mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes.
2. The process of claim 1, wherein the texture map for the mesh model comprises a series of scanlines each of which corresponds to a different one of the planes in the series of planes, each of the scanlines comprises a series of texels, and the process action of generating a texture map for the mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes comprises the actions of:
for each of the planes in the series of planes,
analyzing each of the contours that is defined by the plane in a prescribed order across the plane to identify a series of point locations along the contour, said analysis being performed starting from a prescribed zero position on the contour, said zero position being the same for each of the contours that is defined by the plane,
for each of the contours that is defined by the plane, using the series of point locations to determine a mathematical equation describing the contour,
assigning the texels in the scanline corresponding to the plane to the contours that are defined by the plane, said texel assignment being performed in the prescribed order across the plane, and
entering information specifying said texel assignment into the texture map for the mesh model; and
using the one or more streams of sensor data that represent the scene to compute texture data for each of the texels that is in the texture map for the mesh model; and
entering said computed texture data into the texture map for the mesh model.
3. The process of claim 2, wherein the process action of using the one or more streams of sensor data that represent the scene to compute texture data for each of the texels that is in the texture map for the mesh model comprises an action of, for each of said texels, using a projective texture mapping method to sample each of said streams of sensor data and combine texture information from each of said samples to generate texture data for the texel.
4. The process of claim 2, wherein the process action of assigning the texels in the scanline corresponding to the plane to the contours that are defined by the plane comprises the actions of:
for each of the contours that is defined by the plane,
calculating the length of the contour,
calculating the normalized length of the contour by dividing the length of the contour by the sum of the lengths of all of the contours that are defined by the plane,
calculating the number of texels in said scanline that are to be assigned to the contour by multiplying the normalized length of the contour and the total number of texels that is in said scanline, and
assigning said calculated number of texels to the contour.
5. The process of claim 1, wherein the texture data comprises one or more of: color data; or specular highlight data; or transparency data; or reflection data; or shadowing data.
6. The process of claim 1, wherein each of the mesh models comprises a collection of vertices, a prescribed spacing is used between successive planes in the series of planes, and said spacing is selected such that the series of planes intersects a maximum number of vertices in each of the mesh models.
7. The process of claim 1, wherein either,
the one or more streams of sensor data comprise a single stream of sensor data which represents the scene from a single geometric perspective, and the video being generated is a single viewpoint video, or
the one or more streams of sensor data comprise a plurality of streams of sensor data each of which represents the scene from a different geometric perspective, and the video being generated is a free viewpoint video.
8. The process of claim 1, further comprising an action of storing the scene proxies, said storing comprising the actions of:
for each of the mesh models,
storing a mathematical equation describing each of the contours that is defined by the series of planes,
storing data specifying which contours on neighboring planes in the series of planes correspond to each other, and
storing the texture map for the mesh model.
9. The process of claim 8, wherein the mathematical equation describing a given contour specifies either a polygon approximation of the contour, or a non-uniform rational basis spline curve approximation of the contour.
10. The process of claim 8 wherein,
whenever the spatial orientation of the series of planes is not pre-determined, the process action of storing the scene proxies further comprises an action of storing data specifying said spatial orientation, and
whenever the geometry of the series of planes is not pre-determined, the process action of storing the scene proxies further comprises an action of storing data specifying said geometry, said data comprising one or more of:
data specifying the number of planes in the series of planes; or
data specifying a prescribed spacing that is used between successive planes in the series of planes; or
data specifying the shape and dimensions of each of the planes in the series of planes.
11. The process of claim 8 wherein,
whenever the prescribed order across the plane is not pre-determined, the process action of storing the scene proxies further comprises an action of storing data specifying said order,
whenever the number of texels in each of the scanlines is not pre-determined, the process action of storing the scene proxies further comprises an action of storing data specifying said number, and
whenever the prescribed zero position on the contour is not pre-determined, the process action of storing the scene proxies further comprises an action of storing data specifying said zero position.
12. The process of claim 1, further comprising an action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network, said distribution comprising the actions of:
for each of the mesh models,
transmitting a mathematical equation describing each of the contours that is defined by the series of planes over the network to said other computing device,
transmitting data specifying which contours on neighboring planes in the series of planes correspond to each other over the network to said other computing device, and
transmitting the texture map for the mesh model over the network to said other computing device.
13. The process of claim 12, wherein whenever the spatial orientation of the series of planes is not pre-determined, the process action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network further comprises an action of transmitting data specifying said spatial orientation over the network to said other computing device.
14. The process of claim 12, wherein whenever the geometry of the series of planes is not pre-determined, the process action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network further comprises an action of transmitting data specifying said geometry over the network to said other computing device, said data comprising one or more of:
data specifying the number of planes in the series of planes; or
data specifying a prescribed spacing that is used between successive planes in the series of planes; or
data specifying the shape and dimensions of each of the planes in the series of planes.
15. The process of claim 12, wherein,
whenever the prescribed order across the plane is not pre-determined, the process action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network further comprises an action of transmitting data specifying said order over the network to said other computing device,
whenever the number of texels in each of the scanlines is not pre-determined, the process action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network further comprises an action of transmitting data specifying said number over the network to said other computing device, and
whenever the prescribed zero position on the contour is not pre-determined, the process action of distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network further comprises an action of transmitting data specifying said zero position over the network to said other computing device.
16. The process of claim 1, wherein the series of planes comprises either a horizontal spatial orientation or a vertical spatial orientation.
17. A computer-implemented process for presenting a video of a scene to a user, comprising:
using a computing device to perform the following process actions:
receiving scene proxies, said scene proxies comprising:
a stream of mathematical equations describing contours that are defined by a series of planes that are parallel to each other, and
a stream of texture maps defining texture data corresponding to each of the contours that is defined by the series of planes;
rendering images of the scene from the scene proxies, said rendering comprising the actions of:
constructing the series of planes using data specifying the spatial orientation and geometry of the series of planes,
constructing the contours that are defined by the series of planes using the stream of mathematical equations,
constructing a series of point locations along each of said contours, said construction being performed in a prescribed order across each of the planes in the series of planes, said construction also being performed starting from a prescribed zero position on each of said contours,
tessellating the point locations that are defined by the series of planes, said tessellation generating a stream of polygonal models, each polygonal model comprising a collection of polygonal faces that are formed by neighboring point locations on corresponding contours on neighboring planes in the series of planes,
sampling the stream of texture maps to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models, and
using said identified texture data to add texture to each of the polygonal faces in the stream of polygonal models; and
displaying the images of the scene.
18. The process of claim 17, wherein each of the texture maps in the stream of texture maps comprises a series of scanlines each of which corresponds to a different one of the planes in the series of planes, each of the scanlines comprises a series of texels that are assigned to each of the contours that is defined by the plane corresponding to the scanline, and the process action of sampling the stream of texture maps to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models comprises the actions of:
for each of the scanlines in each of the texture maps, adapting the number of texels in the scanline that are assigned to each one of the contours that is defined by the plane corresponding to the scanline to be the average of the number of texels in the scanline that are assigned to said one of the contours and the number of texels in the next scanline in the series of scanlines that are assigned to a contour that corresponds to said one of the contours, said adaption resulting in a modified version of each of the texture maps; and
sampling the modified version of each of the texture maps to identify the texture data that corresponds to each of the polygonal faces in the stream of polygonal models.
19. The process of claim 17, wherein the video being presented comprises one of:
asynchronous single viewpoint video; or
asynchronous free viewpoint video; or
unidirectional live single viewpoint video; or
unidirectional live free viewpoint video; or
bidirectional live single viewpoint video; or
bidirectional live free viewpoint video.
20. A computer-implemented process for generating a video of a scene, comprising:
using a computing device to perform the following process actions:
receiving one or more streams of sensor data that represent the scene;
generating scene proxies from said streams of sensor data, said scene proxies generation comprising the actions of:
generating a stream of mesh models of the scene from said streams of sensor data, and
for each of the mesh models,
slicing the mesh model using a series of planes that are parallel to each other, each of the planes in the series defining one or more contours each of which defines a specific region on the plane where the mesh model intersects the plane, and
generating a texture map for the mesh model which defines texture data corresponding to each of the contours that is defined by the series of planes, said texture map comprising a series of scanlines each of which corresponds to a different one of the planes in the series of planes, each of the scanlines comprising a series of texels, said texture map generation comprising the actions of,
for each of the planes in the series of planes,
 analyzing each of the contours that is defined by the plane in a prescribed order across the plane to identify a series of point locations along the contour,
 for each of the contours that is defined by the plane, using the series of point locations to determine a mathematical equation describing the contour,
 assigning the texels in the scanline corresponding to the plane to the contours that are defined by the plane, said texel assignment being performed in the prescribed order across the plane, and
 entering information specifying said texel assignment into the texture map for the mesh model,
using the one or more streams of sensor data that represent the scene to compute texture data for each of the texels that is in the texture map for the mesh model, and
entering said computed texture data into the texture map for the mesh model; and
distributing the scene proxies to an end user who either is, or will be, viewing the video on another computing device which is connected to a data communication network, said distribution comprising the actions of:
for each of the mesh models,
transmitting a mathematical equation describing each of the contours that is defined by the series of planes over the network to said other computing device,
transmitting data specifying which contours on neighboring planes in the series of planes correspond to each other over the network to said other computing device, and
transmitting the texture map for the mesh model over the network to said other computing device.
US13/790,158 2012-05-31 2013-03-08 Video generation using convict hulls Abandoned US20130321413A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/790,158 US20130321413A1 (en) 2012-05-31 2013-03-08 Video generation using convict hulls

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201261653983P 2012-05-31 2012-05-31
US13/790,158 US20130321413A1 (en) 2012-05-31 2013-03-08 Video generation using convict hulls

Publications (1)

Publication Number Publication Date
US20130321413A1 true US20130321413A1 (en) 2013-12-05

Family

ID=49669652

Family Applications (10)

Application Number Title Priority Date Filing Date
US13/566,877 Active 2034-02-16 US9846960B2 (en) 2012-05-31 2012-08-03 Automated camera array calibration
US13/588,917 Abandoned US20130321586A1 (en) 2012-05-31 2012-08-17 Cloud based free viewpoint video streaming
US13/598,536 Abandoned US20130321593A1 (en) 2012-05-31 2012-08-29 View frustum culling for free viewpoint video (fvv)
US13/599,436 Active 2034-05-03 US9251623B2 (en) 2012-05-31 2012-08-30 Glancing angle exclusion
US13/599,170 Abandoned US20130321396A1 (en) 2012-05-31 2012-08-30 Multi-input free viewpoint video processing pipeline
US13/599,263 Active 2033-02-25 US8917270B2 (en) 2012-05-31 2012-08-30 Video generation using three-dimensional hulls
US13/599,678 Abandoned US20130321566A1 (en) 2012-05-31 2012-08-30 Audio source positioning using a camera
US13/598,747 Abandoned US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video
US13/614,852 Active 2033-10-29 US9256980B2 (en) 2012-05-31 2012-09-13 Interpolating oriented disks in 3D space for constructing high fidelity geometric proxies from point clouds
US13/790,158 Abandoned US20130321413A1 (en) 2012-05-31 2013-03-08 Video generation using convict hulls

Family Applications Before (9)

Application Number Title Priority Date Filing Date
US13/566,877 Active 2034-02-16 US9846960B2 (en) 2012-05-31 2012-08-03 Automated camera array calibration
US13/588,917 Abandoned US20130321586A1 (en) 2012-05-31 2012-08-17 Cloud based free viewpoint video streaming
US13/598,536 Abandoned US20130321593A1 (en) 2012-05-31 2012-08-29 View frustum culling for free viewpoint video (fvv)
US13/599,436 Active 2034-05-03 US9251623B2 (en) 2012-05-31 2012-08-30 Glancing angle exclusion
US13/599,170 Abandoned US20130321396A1 (en) 2012-05-31 2012-08-30 Multi-input free viewpoint video processing pipeline
US13/599,263 Active 2033-02-25 US8917270B2 (en) 2012-05-31 2012-08-30 Video generation using three-dimensional hulls
US13/599,678 Abandoned US20130321566A1 (en) 2012-05-31 2012-08-30 Audio source positioning using a camera
US13/598,747 Abandoned US20130321575A1 (en) 2012-05-31 2012-08-30 High definition bubbles for rendering free viewpoint video
US13/614,852 Active 2033-10-29 US9256980B2 (en) 2012-05-31 2012-09-13 Interpolating oriented disks in 3D space for constructing high fidelity geometric proxies from point clouds

Country Status (1)

Country Link
US (10) US9846960B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180176533A1 (en) * 2015-06-11 2018-06-21 Conti Temic Microelectronic Gmbh Method for generating a virtual image of vehicle surroundings
US10681337B2 (en) * 2017-04-14 2020-06-09 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation
WO2021063271A1 (en) * 2019-09-30 2021-04-08 Oppo广东移动通信有限公司 Human body model reconstruction method and reconstruction system, and storage medium
US20210409669A1 (en) * 2018-11-21 2021-12-30 Boe Technology Group Co., Ltd. A method for generating and displaying panorama images based on rendering engine and a display apparatus

Families Citing this family (239)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5001286B2 (en) * 2005-10-11 2012-08-15 プライム センス リミティド Object reconstruction method and system
US11792538B2 (en) 2008-05-20 2023-10-17 Adeia Imaging Llc Capturing and processing of images including occlusions focused on an image sensor by a lens stack array
US8866920B2 (en) 2008-05-20 2014-10-21 Pelican Imaging Corporation Capturing and processing of images using monolithic camera array with heterogeneous imagers
US20150373153A1 (en) 2010-06-30 2015-12-24 Primal Space Systems, Inc. System and method to reduce bandwidth requirement for visibility event packet streaming using a predicted maximal view frustum and predicted maximal viewpoint extent, each computed at runtime
US9892546B2 (en) * 2010-06-30 2018-02-13 Primal Space Systems, Inc. Pursuit path camera model method and system
US8878950B2 (en) 2010-12-14 2014-11-04 Pelican Imaging Corporation Systems and methods for synthesizing high resolution images using super-resolution processes
EP2761534B1 (en) 2011-09-28 2020-11-18 FotoNation Limited Systems for encoding light field image files
US9001960B2 (en) * 2012-01-04 2015-04-07 General Electric Company Method and apparatus for reducing noise-related imaging artifacts
US9300841B2 (en) * 2012-06-25 2016-03-29 Yoldas Askan Method of generating a smooth image from point cloud data
CN107346061B (en) 2012-08-21 2020-04-24 快图有限公司 System and method for parallax detection and correction in images captured using an array camera
US10079968B2 (en) 2012-12-01 2018-09-18 Qualcomm Incorporated Camera having additional functionality based on connectivity with a host device
US9519968B2 (en) * 2012-12-13 2016-12-13 Hewlett-Packard Development Company, L.P. Calibrating visual sensors using homography operators
US9224227B2 (en) * 2012-12-21 2015-12-29 Nvidia Corporation Tile shader for screen space, a method of rendering and a graphics processing unit employing the tile shader
US8866912B2 (en) 2013-03-10 2014-10-21 Pelican Imaging Corporation System and methods for calibration of an array camera using a single captured image
US9144905B1 (en) * 2013-03-13 2015-09-29 Hrl Laboratories, Llc Device and method to identify functional parts of tools for robotic manipulation
US9578259B2 (en) 2013-03-14 2017-02-21 Fotonation Cayman Limited Systems and methods for reducing motion blur in images or video in ultra low light with array cameras
US9445003B1 (en) * 2013-03-15 2016-09-13 Pelican Imaging Corporation Systems and methods for synthesizing high resolution images using image deconvolution based on motion and depth information
WO2014162824A1 (en) * 2013-04-04 2014-10-09 ソニー株式会社 Display control device, display control method and program
US9191643B2 (en) 2013-04-15 2015-11-17 Microsoft Technology Licensing, Llc Mixing infrared and color component data point clouds
US10262462B2 (en) 2014-04-18 2019-04-16 Magic Leap, Inc. Systems and methods for augmented and virtual reality
US9208609B2 (en) * 2013-07-01 2015-12-08 Mitsubishi Electric Research Laboratories, Inc. Method for fitting primitive shapes to 3D point clouds using distance fields
EP3022898B1 (en) * 2013-07-19 2020-04-15 Google Technology Holdings LLC Asymmetric sensor array for capturing images
US10140751B2 (en) * 2013-08-08 2018-11-27 Imagination Technologies Limited Normal offset smoothing
CN104424655A (en) * 2013-09-10 2015-03-18 鸿富锦精密工业(深圳)有限公司 System and method for reconstructing point cloud curved surface
JP6476658B2 (en) * 2013-09-11 2019-03-06 ソニー株式会社 Image processing apparatus and method
US9286718B2 (en) * 2013-09-27 2016-03-15 Ortery Technologies, Inc. Method using 3D geometry data for virtual reality image presentation and control in 3D space
US10591969B2 (en) 2013-10-25 2020-03-17 Google Technology Holdings LLC Sensor-based near-field communication authentication
US9836885B1 (en) 2013-10-25 2017-12-05 Appliance Computing III, Inc. Image-based rendering of real spaces
US9888333B2 (en) * 2013-11-11 2018-02-06 Google Technology Holdings LLC Three-dimensional audio rendering techniques
US10119808B2 (en) 2013-11-18 2018-11-06 Fotonation Limited Systems and methods for estimating depth from projected texture using camera arrays
EP3075140B1 (en) 2013-11-26 2018-06-13 FotoNation Cayman Limited Array camera configurations incorporating multiple constituent array cameras
EP2881918B1 (en) * 2013-12-06 2018-02-07 My Virtual Reality Software AS Method for visualizing three-dimensional data
US9233469B2 (en) * 2014-02-13 2016-01-12 GM Global Technology Operations LLC Robotic system with 3D box location functionality
US9530226B2 (en) * 2014-02-18 2016-12-27 Par Technology Corporation Systems and methods for optimizing N dimensional volume data for transmission
EP3111299A4 (en) 2014-02-28 2017-11-22 Hewlett-Packard Development Company, L.P. Calibration of sensors and projector
US9396586B2 (en) * 2014-03-14 2016-07-19 Matterport, Inc. Processing and/or transmitting 3D data
US10600245B1 (en) * 2014-05-28 2020-03-24 Lucasfilm Entertainment Company Ltd. Navigating a virtual environment of a media content item
CN104089628B (en) * 2014-06-30 2017-02-08 中国科学院光电研究院 Self-adaption geometric calibration method of light field camera
US11051000B2 (en) 2014-07-14 2021-06-29 Mitsubishi Electric Research Laboratories, Inc. Method for calibrating cameras with non-overlapping views
US10169909B2 (en) * 2014-08-07 2019-01-01 Pixar Generating a volumetric projection for an object
US11205305B2 (en) 2014-09-22 2021-12-21 Samsung Electronics Company, Ltd. Presentation of three-dimensional video
US10257494B2 (en) 2014-09-22 2019-04-09 Samsung Electronics Co., Ltd. Reconstruction of three-dimensional video
CN113256730B (en) 2014-09-29 2023-09-05 快图有限公司 System and method for dynamic calibration of an array camera
US9600892B2 (en) * 2014-11-06 2017-03-21 Symbol Technologies, Llc Non-parametric method of and system for estimating dimensions of objects of arbitrary shape
EP3221851A1 (en) * 2014-11-20 2017-09-27 Cappasity Inc. Systems and methods for 3d capture of objects using multiple range cameras and multiple rgb cameras
US9396554B2 (en) 2014-12-05 2016-07-19 Symbol Technologies, Llc Apparatus for and method of estimating dimensions of an object associated with a code in automatic response to reading the code
DE102014118989A1 (en) * 2014-12-18 2016-06-23 Connaught Electronics Ltd. Method for calibrating a camera system, camera system and motor vehicle
US11019330B2 (en) 2015-01-19 2021-05-25 Aquifi, Inc. Multiple camera system with auto recalibration
US9686520B2 (en) * 2015-01-22 2017-06-20 Microsoft Technology Licensing, Llc Reconstructing viewport upon user viewpoint misprediction
US9661312B2 (en) * 2015-01-22 2017-05-23 Microsoft Technology Licensing, Llc Synthesizing second eye viewport using interleaving
EP3254435B1 (en) * 2015-02-03 2020-08-26 Dolby Laboratories Licensing Corporation Post-conference playback system having higher perceived quality than originally heard in the conference
CN107431801A (en) * 2015-03-01 2017-12-01 奈克斯特Vr股份有限公司 The method and apparatus for support content generation, sending and/or resetting
EP3070942B1 (en) * 2015-03-17 2023-11-22 InterDigital CE Patent Holdings Method and apparatus for displaying light field video data
US10878278B1 (en) * 2015-05-16 2020-12-29 Sturfee, Inc. Geo-localization based on remotely sensed visual features
US9460513B1 (en) 2015-06-17 2016-10-04 Mitsubishi Electric Research Laboratories, Inc. Method for reconstructing a 3D scene as a 3D model using images acquired by 3D sensors and omnidirectional cameras
US10554713B2 (en) 2015-06-19 2020-02-04 Microsoft Technology Licensing, Llc Low latency application streaming using temporal frame transformation
KR101835434B1 (en) * 2015-07-08 2018-03-09 고려대학교 산학협력단 Method and Apparatus for generating a protection image, Method for mapping between image pixel and depth value
US9848212B2 (en) * 2015-07-10 2017-12-19 Futurewei Technologies, Inc. Multi-view video streaming with fast and smooth view switch
WO2017030985A1 (en) 2015-08-14 2017-02-23 Pcms Holdings, Inc. System and method for augmented reality multi-view telepresence
GB2543776B (en) * 2015-10-27 2019-02-06 Imagination Tech Ltd Systems and methods for processing images of objects
US10812778B1 (en) 2015-11-09 2020-10-20 Cognex Corporation System and method for calibrating one or more 3D sensors mounted on a moving manipulator
US20180374239A1 (en) * 2015-11-09 2018-12-27 Cognex Corporation System and method for field calibration of a vision system imaging two opposite sides of a calibration object
US10757394B1 (en) * 2015-11-09 2020-08-25 Cognex Corporation System and method for calibrating a plurality of 3D sensors with respect to a motion conveyance
US11562502B2 (en) * 2015-11-09 2023-01-24 Cognex Corporation System and method for calibrating a plurality of 3D sensors with respect to a motion conveyance
CN108369639B (en) * 2015-12-11 2022-06-21 虞晶怡 Image-based image rendering method and system using multiple cameras and depth camera array
US10352689B2 (en) 2016-01-28 2019-07-16 Symbol Technologies, Llc Methods and systems for high precision locationing with depth values
US10145955B2 (en) 2016-02-04 2018-12-04 Symbol Technologies, Llc Methods and systems for processing point-cloud data with a line scanner
KR20170095030A (en) * 2016-02-12 2017-08-22 삼성전자주식회사 Scheme for supporting virtual reality content display in communication system
CN107097698B (en) * 2016-02-22 2021-10-01 福特环球技术公司 Inflatable airbag system for a vehicle seat, seat assembly and method for adjusting the same
US11573325B2 (en) 2016-03-11 2023-02-07 Kaarta, Inc. Systems and methods for improvements in scanning and mapping
US11567201B2 (en) 2016-03-11 2023-01-31 Kaarta, Inc. Laser scanner with real-time, online ego-motion estimation
US10989542B2 (en) 2016-03-11 2021-04-27 Kaarta, Inc. Aligning measured signal data with slam localization data and uses thereof
EP3427008B1 (en) 2016-03-11 2022-09-07 Kaarta, Inc. Laser scanner with real-time, online ego-motion estimation
US10721451B2 (en) 2016-03-23 2020-07-21 Symbol Technologies, Llc Arrangement for, and method of, loading freight into a shipping container
CA2961921C (en) 2016-03-29 2020-05-12 Institut National D'optique Camera calibration method using a calibration target
WO2017172528A1 (en) 2016-04-01 2017-10-05 Pcms Holdings, Inc. Apparatus and method for supporting interactive augmented reality functionalities
US9805240B1 (en) 2016-04-18 2017-10-31 Symbol Technologies, Llc Barcode scanning and dimensioning
CN107341768B (en) * 2016-04-29 2022-03-11 微软技术许可有限责任公司 Grid noise reduction
US10376320B2 (en) 2016-05-11 2019-08-13 Affera, Inc. Anatomical model generation
WO2017197247A2 (en) 2016-05-12 2017-11-16 Affera, Inc. Anatomical model controlling
EP3264759A1 (en) 2016-06-30 2018-01-03 Thomson Licensing An apparatus and a method for generating data representative of a pixel beam
US10192345B2 (en) * 2016-07-19 2019-01-29 Qualcomm Incorporated Systems and methods for improved surface normal estimation
US11082471B2 (en) * 2016-07-27 2021-08-03 R-Stor Inc. Method and apparatus for bonding communication technologies
US10574909B2 (en) 2016-08-08 2020-02-25 Microsoft Technology Licensing, Llc Hybrid imaging sensor for structured light object capture
US10776661B2 (en) 2016-08-19 2020-09-15 Symbol Technologies, Llc Methods, systems and apparatus for segmenting and dimensioning objects
US9980078B2 (en) 2016-10-14 2018-05-22 Nokia Technologies Oy Audio object modification in free-viewpoint rendering
US10229533B2 (en) * 2016-11-03 2019-03-12 Mitsubishi Electric Research Laboratories, Inc. Methods and systems for fast resampling method and apparatus for point cloud data
US11042161B2 (en) 2016-11-16 2021-06-22 Symbol Technologies, Llc Navigation control method and apparatus in a mobile automation system
US10451405B2 (en) 2016-11-22 2019-10-22 Symbol Technologies, Llc Dimensioning system for, and method of, dimensioning freight in motion along an unconstrained path in a venue
JP6948171B2 (en) * 2016-11-30 2021-10-13 キヤノン株式会社 Image processing equipment and image processing methods, programs
WO2018100928A1 (en) 2016-11-30 2018-06-07 キヤノン株式会社 Image processing device and method
EP3336801A1 (en) * 2016-12-19 2018-06-20 Thomson Licensing Method and apparatus for constructing lighting environment representations of 3d scenes
US10354411B2 (en) 2016-12-20 2019-07-16 Symbol Technologies, Llc Methods, systems and apparatus for segmenting objects
EP3565259A1 (en) * 2016-12-28 2019-11-06 Panasonic Intellectual Property Corporation of America Three-dimensional model distribution method, three-dimensional model receiving method, three-dimensional model distribution device, and three-dimensional model receiving device
US11096004B2 (en) 2017-01-23 2021-08-17 Nokia Technologies Oy Spatial audio rendering point extension
US11665308B2 (en) 2017-01-31 2023-05-30 Tetavi, Ltd. System and method for rendering free viewpoint video for sport applications
JP7159057B2 (en) * 2017-02-10 2022-10-24 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ Free-viewpoint video generation method and free-viewpoint video generation system
JP7086522B2 (en) * 2017-02-28 2022-06-20 キヤノン株式会社 Image processing equipment, information processing methods and programs
US10531219B2 (en) 2017-03-20 2020-01-07 Nokia Technologies Oy Smooth rendering of overlapping audio-object interactions
WO2018172614A1 (en) 2017-03-22 2018-09-27 Nokia Technologies Oy A method and an apparatus and a computer program product for adaptive streaming
US10726574B2 (en) 2017-04-11 2020-07-28 Dolby Laboratories Licensing Corporation Passive multi-wearable-devices tracking
US10939038B2 (en) * 2017-04-24 2021-03-02 Intel Corporation Object pre-encoding for 360-degree view for optimal quality and latency
US10726273B2 (en) 2017-05-01 2020-07-28 Symbol Technologies, Llc Method and apparatus for shelf feature and object placement detection from shelf images
US10591918B2 (en) 2017-05-01 2020-03-17 Symbol Technologies, Llc Fixed segmented lattice planning for a mobile automation apparatus
US11367092B2 (en) 2017-05-01 2022-06-21 Symbol Technologies, Llc Method and apparatus for extracting and processing price text from an image set
US10663590B2 (en) 2017-05-01 2020-05-26 Symbol Technologies, Llc Device and method for merging lidar data
US10949798B2 (en) 2017-05-01 2021-03-16 Symbol Technologies, Llc Multimodal localization and mapping for a mobile automation apparatus
US11093896B2 (en) 2017-05-01 2021-08-17 Symbol Technologies, Llc Product status detection system
US11449059B2 (en) 2017-05-01 2022-09-20 Symbol Technologies, Llc Obstacle detection for a mobile automation apparatus
US11074036B2 (en) 2017-05-05 2021-07-27 Nokia Technologies Oy Metadata-free audio-object interactions
US11600084B2 (en) 2017-05-05 2023-03-07 Symbol Technologies, Llc Method and apparatus for detecting and interpreting price label text
CN108881784B (en) * 2017-05-12 2020-07-03 腾讯科技(深圳)有限公司 Virtual scene implementation method and device, terminal and server
US10165386B2 (en) 2017-05-16 2018-12-25 Nokia Technologies Oy VR audio superzoom
US10154176B1 (en) * 2017-05-30 2018-12-11 Intel Corporation Calibrating depth cameras using natural objects with expected shapes
CN110476186B (en) * 2017-06-07 2020-12-29 谷歌有限责任公司 High speed high fidelity face tracking
WO2018226508A1 (en) 2017-06-09 2018-12-13 Pcms Holdings, Inc. Spatially faithful telepresence supporting varying geometries and moving users
BR102017012517A2 (en) * 2017-06-12 2018-12-26 Samsung Eletrônica da Amazônia Ltda. method for 360 ° media display or bubble interface
CN110832553A (en) 2017-06-29 2020-02-21 索尼公司 Image processing apparatus, image processing method, and program
JP6948175B2 (en) 2017-07-06 2021-10-13 キヤノン株式会社 Image processing device and its control method
US11049218B2 (en) 2017-08-11 2021-06-29 Samsung Electronics Company, Ltd. Seamless image stitching
EP3669330A4 (en) * 2017-08-15 2021-04-07 Nokia Technologies Oy Encoding and decoding of volumetric video
US11405643B2 (en) 2017-08-15 2022-08-02 Nokia Technologies Oy Sequential encoding and decoding of volumetric video
US11290758B2 (en) * 2017-08-30 2022-03-29 Samsung Electronics Co., Ltd. Method and apparatus of point-cloud streaming
JP6409107B1 (en) 2017-09-06 2018-10-17 キヤノン株式会社 Information processing apparatus, information processing method, and program
US10572763B2 (en) 2017-09-07 2020-02-25 Symbol Technologies, Llc Method and apparatus for support surface edge detection
US10521914B2 (en) 2017-09-07 2019-12-31 Symbol Technologies, Llc Multi-sensor object recognition system and method
US11818401B2 (en) 2017-09-14 2023-11-14 Apple Inc. Point cloud geometry compression using octrees and binary arithmetic encoding with adaptive look-up tables
US10861196B2 (en) * 2017-09-14 2020-12-08 Apple Inc. Point cloud compression
US10897269B2 (en) 2017-09-14 2021-01-19 Apple Inc. Hierarchical point cloud compression
US10909725B2 (en) 2017-09-18 2021-02-02 Apple Inc. Point cloud compression
US11113845B2 (en) 2017-09-18 2021-09-07 Apple Inc. Point cloud compression using non-cubic projections and masks
JP6433559B1 (en) 2017-09-19 2018-12-05 キヤノン株式会社 Providing device, providing method, and program
CN107610182B (en) * 2017-09-22 2018-09-11 哈尔滨工业大学 Calibration method for the center of a light-field camera microlens array
JP6425780B1 (en) 2017-09-22 2018-11-21 キヤノン株式会社 Image processing system, image processing apparatus, image processing method and program
US11395087B2 (en) 2017-09-29 2022-07-19 Nokia Technologies Oy Level-based audio-object interactions
EP3467777A1 (en) * 2017-10-06 2019-04-10 Thomson Licensing A method and apparatus for encoding/decoding the colors of a point cloud representing a 3d object
WO2019099605A1 (en) 2017-11-17 2019-05-23 Kaarta, Inc. Methods and systems for geo-referencing mapping systems
US10607373B2 (en) 2017-11-22 2020-03-31 Apple Inc. Point cloud compression with closed-loop color conversion
US10951879B2 (en) 2017-12-04 2021-03-16 Canon Kabushiki Kaisha Method, system and apparatus for capture of image data for free viewpoint video
US11430412B2 (en) * 2017-12-19 2022-08-30 Sony Interactive Entertainment Inc. Freely selected point of view image generating apparatus, reference image data generating apparatus, freely selected point of view image generating method, and reference image data generating method
KR102334070B1 (en) 2018-01-18 2021-12-03 삼성전자주식회사 Electric apparatus and method for control thereof
WO2019151569A1 (en) * 2018-01-30 2019-08-08 가이아쓰리디 주식회사 Method for providing three-dimensional geographic information system web service
US10417806B2 (en) * 2018-02-15 2019-09-17 JJK Holdings, LLC Dynamic local temporal-consistent textured mesh compression
JP2019144958A (en) * 2018-02-22 2019-08-29 キヤノン株式会社 Image processing device, image processing method, and program
WO2019165194A1 (en) 2018-02-23 2019-08-29 Kaarta, Inc. Methods and systems for processing and colorizing point clouds and meshes
US10542368B2 (en) 2018-03-27 2020-01-21 Nokia Technologies Oy Audio content modification for playback audio
US11308577B2 (en) * 2018-04-04 2022-04-19 Sony Interactive Entertainment Inc. Reference image generation apparatus, display image generation apparatus, reference image generation method, and display image generation method
US11327504B2 (en) 2018-04-05 2022-05-10 Symbol Technologies, Llc Method, system and apparatus for mobile automation apparatus localization
US10740911B2 (en) 2018-04-05 2020-08-11 Symbol Technologies, Llc Method, system and apparatus for correcting translucency artifacts in data representing a support structure
US10823572B2 (en) 2018-04-05 2020-11-03 Symbol Technologies, Llc Method, system and apparatus for generating navigational data
US10832436B2 (en) 2018-04-05 2020-11-10 Symbol Technologies, Llc Method, system and apparatus for recovering label positions
US10809078B2 (en) 2018-04-05 2020-10-20 Symbol Technologies, Llc Method, system and apparatus for dynamic path generation
US10909727B2 (en) 2018-04-10 2021-02-02 Apple Inc. Hierarchical point cloud compression with smoothing
US10939129B2 (en) 2018-04-10 2021-03-02 Apple Inc. Point cloud compression
US11010928B2 (en) 2018-04-10 2021-05-18 Apple Inc. Adaptive distance based point cloud compression
US10909726B2 (en) 2018-04-10 2021-02-02 Apple Inc. Point cloud compression
US11017566B1 (en) 2018-07-02 2021-05-25 Apple Inc. Point cloud compression with adaptive filtering
WO2020009826A1 (en) 2018-07-05 2020-01-09 Kaarta, Inc. Methods and systems for auto-leveling of point clouds and 3d models
US11202098B2 (en) 2018-07-05 2021-12-14 Apple Inc. Point cloud compression with multi-resolution video encoding
US11012713B2 (en) 2018-07-12 2021-05-18 Apple Inc. Bit stream structure for compressed point cloud data
US11367224B2 (en) 2018-10-02 2022-06-21 Apple Inc. Occupancy map block-to-patch information compression
US11506483B2 (en) 2018-10-05 2022-11-22 Zebra Technologies Corporation Method, system and apparatus for support structure depth determination
US11010920B2 (en) 2018-10-05 2021-05-18 Zebra Technologies Corporation Method, system and apparatus for object detection in point clouds
US10972835B2 (en) * 2018-11-01 2021-04-06 Sennheiser Electronic Gmbh & Co. Kg Conference system with a microphone array system and a method of speech acquisition in a conference system
US11090811B2 (en) 2018-11-13 2021-08-17 Zebra Technologies Corporation Method and apparatus for labeling of support structures
US11003188B2 (en) 2018-11-13 2021-05-11 Zebra Technologies Corporation Method, system and apparatus for obstacle handling in navigational path generation
US11416000B2 (en) 2018-12-07 2022-08-16 Zebra Technologies Corporation Method and apparatus for navigational ray tracing
CN109618122A (en) * 2018-12-07 2019-04-12 合肥万户网络技术有限公司 Virtual office conference system
US11079240B2 (en) 2018-12-07 2021-08-03 Zebra Technologies Corporation Method, system and apparatus for adaptive particle filter localization
US11100303B2 (en) 2018-12-10 2021-08-24 Zebra Technologies Corporation Method, system and apparatus for auxiliary label detection and association
US11423572B2 (en) 2018-12-12 2022-08-23 Analog Devices, Inc. Built-in calibration of time-of-flight depth imaging systems
US11015938B2 (en) 2018-12-12 2021-05-25 Zebra Technologies Corporation Method, system and apparatus for navigational assistance
WO2020122675A1 (en) * 2018-12-13 2020-06-18 삼성전자주식회사 Method, device, and computer-readable recording medium for compressing 3d mesh content
US10731970B2 (en) 2018-12-13 2020-08-04 Zebra Technologies Corporation Method, system and apparatus for support structure detection
US10818077B2 (en) 2018-12-14 2020-10-27 Canon Kabushiki Kaisha Method, system and apparatus for controlling a virtual camera
CA3028708A1 (en) 2018-12-28 2020-06-28 Zih Corp. Method, system and apparatus for dynamic loop closure in mapping trajectories
JP7211835B2 (en) * 2019-02-04 2023-01-24 i-PRO株式会社 IMAGING SYSTEM AND SYNCHRONIZATION CONTROL METHOD
WO2020164044A1 (en) * 2019-02-14 2020-08-20 北京大学深圳研究生院 Free-viewpoint image synthesis method, device, and apparatus
JP6647433B1 (en) * 2019-02-19 2020-02-14 株式会社メディア工房 Point cloud data communication system, point cloud data transmission device, and point cloud data transmission method
US10797090B2 (en) 2019-02-27 2020-10-06 Semiconductor Components Industries, Llc Image sensor with near-infrared and visible light phase detection pixels
WO2020181112A1 (en) 2019-03-07 2020-09-10 Alibaba Group Holding Limited Video generating method, apparatus, medium, and terminal
US11057564B2 (en) 2019-03-28 2021-07-06 Apple Inc. Multiple layer flexure for supporting a moving image sensor
JP2020173629A (en) * 2019-04-11 2020-10-22 キヤノン株式会社 Image processing system, virtual viewpoint video generation system, and control method and program of image processing system
US11662739B2 (en) 2019-06-03 2023-05-30 Zebra Technologies Corporation Method, system and apparatus for adaptive ceiling-based localization
US11151743B2 (en) 2019-06-03 2021-10-19 Zebra Technologies Corporation Method, system and apparatus for end of aisle detection
US11341663B2 (en) 2019-06-03 2022-05-24 Zebra Technologies Corporation Method, system and apparatus for detecting support structure obstructions
US11200677B2 (en) 2019-06-03 2021-12-14 Zebra Technologies Corporation Method, system and apparatus for shelf edge detection
US11080566B2 (en) 2019-06-03 2021-08-03 Zebra Technologies Corporation Method, system and apparatus for gap detection in support structures with peg regions
US11402846B2 (en) 2019-06-03 2022-08-02 Zebra Technologies Corporation Method, system and apparatus for mitigating data capture light leakage
US11711544B2 (en) 2019-07-02 2023-07-25 Apple Inc. Point cloud compression with supplemental information messages
CN110624220B (en) * 2019-09-04 2021-05-04 福建师范大学 Method for obtaining optimal standing long jump technical template
MX2022003020A (en) 2019-09-17 2022-06-14 Boston Polarimetrics Inc Systems and methods for surface modeling using polarization cues.
US11627314B2 (en) 2019-09-27 2023-04-11 Apple Inc. Video-based point cloud compression with non-normative smoothing
US11562507B2 (en) 2019-09-27 2023-01-24 Apple Inc. Point cloud compression using video encoding with time consistent patches
US11538196B2 (en) 2019-10-02 2022-12-27 Apple Inc. Predictive coding for point cloud compression
US11895307B2 (en) 2019-10-04 2024-02-06 Apple Inc. Block-based predictive coding for point cloud compression
EP4042366A4 (en) 2019-10-07 2023-11-15 Boston Polarimetrics, Inc. Systems and methods for augmentation of sensor systems and imaging systems with polarization
US11315326B2 (en) * 2019-10-15 2022-04-26 At&T Intellectual Property I, L.P. Extended reality anchor caching based on viewport prediction
US11202162B2 (en) 2019-10-18 2021-12-14 Msg Entertainment Group, Llc Synthesizing audio of a venue
US20210120356A1 (en) * 2019-10-18 2021-04-22 Msg Entertainment Group, Llc Mapping Audio To Visual Images on a Display Device Having a Curved Screen
CN110769241B (en) * 2019-11-05 2022-02-01 广州虎牙科技有限公司 Video frame processing method and device, client, and storage medium
KR20230116068A (en) 2019-11-30 2023-08-03 보스턴 폴라리메트릭스, 인크. System and method for segmenting transparent objects using polarization signals
US11507103B2 (en) 2019-12-04 2022-11-22 Zebra Technologies Corporation Method, system and apparatus for localization-based historical obstacle handling
US11107238B2 (en) 2019-12-13 2021-08-31 Zebra Technologies Corporation Method, system and apparatus for detecting item facings
US11734873B2 (en) 2019-12-13 2023-08-22 Sony Group Corporation Real-time volumetric visualization of 2-D images
US11798196B2 (en) 2020-01-08 2023-10-24 Apple Inc. Video-based point cloud compression with predicted patches
US11625866B2 (en) 2020-01-09 2023-04-11 Apple Inc. Geometry encoding using octrees and predictive trees
CN115552486A (en) 2020-01-29 2022-12-30 因思创新有限责任公司 System and method for characterizing an object pose detection and measurement system
KR20220133973A (en) 2020-01-30 2022-10-05 인트린식 이노베이션 엘엘씨 Systems and methods for synthesizing data to train statistical models for different imaging modalities, including polarized images
US11240465B2 (en) 2020-02-21 2022-02-01 Alibaba Group Holding Limited System and method to use decoder information in video super resolution
US11430179B2 (en) * 2020-02-24 2022-08-30 Microsoft Technology Licensing, Llc Depth buffer dilation for remote rendering
US11822333B2 (en) 2020-03-30 2023-11-21 Zebra Technologies Corporation Method, system and apparatus for data capture illumination control
WO2021243088A1 (en) 2020-05-27 2021-12-02 Boston Polarimetrics, Inc. Multi-aperture polarization optical systems using beam splitters
US11776205B2 (en) * 2020-06-09 2023-10-03 Ptc Inc. Determination of interactions with predefined volumes of space based on automated analysis of volumetric video
US11620768B2 (en) 2020-06-24 2023-04-04 Apple Inc. Point cloud geometry compression using octrees with multiple scan orders
US11615557B2 (en) 2020-06-24 2023-03-28 Apple Inc. Point cloud compression using octrees with slicing
US11450024B2 (en) 2020-07-17 2022-09-20 Zebra Technologies Corporation Mixed depth object detection
US11875452B2 (en) * 2020-08-18 2024-01-16 Qualcomm Incorporated Billboard layers in object-space rendering
US11748918B1 (en) * 2020-09-25 2023-09-05 Apple Inc. Synthesized camera arrays for rendering novel viewpoints
JP7386888B2 (en) * 2020-10-08 2023-11-27 グーグル エルエルシー Two-shot composition of the speaker on the screen
US11593915B2 (en) 2020-10-21 2023-02-28 Zebra Technologies Corporation Parallax-tolerant panoramic image generation
US11392891B2 (en) 2020-11-03 2022-07-19 Zebra Technologies Corporation Item placement detection and optimization in material handling systems
US11847832B2 (en) 2020-11-11 2023-12-19 Zebra Technologies Corporation Object classification for autonomous navigation systems
US11527014B2 (en) * 2020-11-24 2022-12-13 Verizon Patent And Licensing Inc. Methods and systems for calibrating surface data capture devices
US11874415B2 (en) * 2020-12-22 2024-01-16 International Business Machines Corporation Earthquake detection and response via distributed visual input
US11703457B2 (en) * 2020-12-29 2023-07-18 Industrial Technology Research Institute Structure diagnosis system and structure diagnosis method
US11651538B2 (en) * 2021-03-17 2023-05-16 International Business Machines Corporation Generating 3D videos from 2D models
US11948338B1 (en) 2021-03-29 2024-04-02 Apple Inc. 3D volumetric content encoding using 2D videos and simplified 3D meshes
US11290658B1 (en) 2021-04-15 2022-03-29 Boston Polarimetrics, Inc. Systems and methods for camera exposure control
US11954886B2 (en) 2021-04-15 2024-04-09 Intrinsic Innovation Llc Systems and methods for six-degree of freedom pose estimation of deformable objects
US11954882B2 (en) 2021-06-17 2024-04-09 Zebra Technologies Corporation Feature-based georegistration for mobile computing devices
US11689813B2 (en) 2021-07-01 2023-06-27 Intrinsic Innovation Llc Systems and methods for high dynamic range imaging using crossed polarizers
CN113761238B (en) * 2021-08-27 2022-08-23 广州文远知行科技有限公司 Point cloud storage method, device, equipment and storage medium
US11887245B2 (en) * 2021-09-02 2024-01-30 Nvidia Corporation Techniques for rendering signed distance functions
CN113905221B (en) * 2021-09-30 2024-01-16 福州大学 Asymmetric transport stream adaptation method and system for stereoscopic panoramic video
CN114355287B (en) * 2022-01-04 2023-08-15 湖南大学 Ultra-short baseline underwater acoustic ranging method and system
WO2023159180A1 (en) * 2022-02-17 2023-08-24 Nutech Ventures Single-pass 3d reconstruction of internal surface of pipelines using depth camera array
CN116800947A (en) * 2022-03-16 2023-09-22 安霸国际有限合伙企业 Rapid RGB-IR calibration verification for mass production process
WO2024006997A1 (en) * 2022-07-01 2024-01-04 Google Llc Three-dimensional video highlight from a camera source

Family Cites Families (103)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6327381B1 (en) 1994-12-29 2001-12-04 Worldscape, Llc Image transformation and synthesis methods
US5850352A (en) 1995-03-31 1998-12-15 The Regents Of The University Of California Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images
JP3461980B2 (en) 1995-08-25 2003-10-27 株式会社東芝 High-speed drawing method and apparatus
US5926400A (en) 1996-11-21 1999-07-20 Intel Corporation Apparatus and method for determining the intensity of a sound in a virtual world
US6064771A (en) 1997-06-23 2000-05-16 Real-Time Geometry Corp. System and method for asynchronous, adaptive moving picture compression, and decompression
US6072496A (en) 1998-06-08 2000-06-06 Microsoft Corporation Method and system for capturing and representing 3D geometry, color and shading of facial expressions and other animated objects
US6226003B1 (en) 1998-08-11 2001-05-01 Silicon Graphics, Inc. Method for rendering silhouette and true edges of 3-D line drawings with occlusion
US6556199B1 (en) 1999-08-11 2003-04-29 Advanced Research And Technology Institute Method and apparatus for fast voxelization of volumetric models
US6509902B1 (en) 2000-02-28 2003-01-21 Mitsubishi Electric Research Laboratories, Inc. Texture filtering for surface elements
US7522186B2 (en) 2000-03-07 2009-04-21 L-3 Communications Corporation Method and apparatus for providing immersive surveillance
US6968299B1 (en) 2000-04-14 2005-11-22 International Business Machines Corporation Method and apparatus for reconstructing a surface using a ball-pivoting algorithm
US6750873B1 (en) 2000-06-27 2004-06-15 International Business Machines Corporation High quality texture reconstruction from multiple scans
US7538764B2 (en) 2001-01-05 2009-05-26 Interuniversitair Micro-Elektronica Centrum (Imec) System and method to obtain surface structures of multi-dimensional objects, and to represent those surface structures for animation, transmission and display
US6919906B2 (en) 2001-05-08 2005-07-19 Microsoft Corporation Discontinuity edge overdraw
GB2378337B (en) 2001-06-11 2005-04-13 Canon Kk 3D Computer modelling apparatus
US6990681B2 (en) 2001-08-09 2006-01-24 Sony Corporation Enhancing broadcast of an event with synthetic scene using a depth map
US7909696B2 (en) 2001-08-09 2011-03-22 Igt Game interaction in 3-D gaming environments
US6781591B2 (en) 2001-08-15 2004-08-24 Mitsubishi Electric Research Laboratories, Inc. Blending multiple images using local and global information
US7023432B2 (en) 2001-09-24 2006-04-04 Geomagic, Inc. Methods, apparatus and computer program products that reconstruct surfaces from data point sets
US7096428B2 (en) 2001-09-28 2006-08-22 Fuji Xerox Co., Ltd. Systems and methods for providing a spatially indexed panoramic video
KR100861161B1 (en) 2002-02-06 2008-09-30 디지털 프로세스 가부시끼가이샤 Computer-readable record medium for storing a three-dimensional displaying program, three-dimensional displaying device, and three-dimensional displaying method
US20040217956A1 (en) 2002-02-28 2004-11-04 Paul Besl Method and system for processing, compressing, streaming, and interactive rendering of 3D color image data
US7515173B2 (en) 2002-05-23 2009-04-07 Microsoft Corporation Head pose tracking system
US7030875B2 (en) 2002-09-04 2006-04-18 Honda Motor Company Ltd. Environmental reasoning using geometric data structure
US7106358B2 (en) 2002-12-30 2006-09-12 Motorola, Inc. Method, system and apparatus for telepresence communications
US20050017969A1 (en) 2003-05-27 2005-01-27 Pradeep Sen Computer graphics rendering using boundary information
US7480401B2 (en) 2003-06-23 2009-01-20 Siemens Medical Solutions Usa, Inc. Method for local surface smoothing with application to chest wall nodule segmentation in lung CT data
US7321669B2 (en) * 2003-07-10 2008-01-22 Sarnoff Corporation Method and apparatus for refining target position and size estimates using image and depth data
GB2405776B (en) 2003-09-05 2008-04-02 Canon Europa Nv 3d computer surface model generation
US7184052B2 (en) 2004-06-18 2007-02-27 Microsoft Corporation Real-time texture rendering using generalized displacement maps
US7292257B2 (en) 2004-06-28 2007-11-06 Microsoft Corporation Interactive viewpoint video system and process
US7671893B2 (en) 2004-07-27 2010-03-02 Microsoft Corp. System and method for interactive multi-view video
US20060023782A1 (en) 2004-07-27 2006-02-02 Microsoft Corporation System and method for off-line multi-view video compression
US7561620B2 (en) 2004-08-03 2009-07-14 Microsoft Corporation System and process for compressing and decompressing multiple, layered, video streams employing spatial and temporal encoding
US7142209B2 (en) 2004-08-03 2006-11-28 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video that was generated using overlapping images of a scene captured from viewpoints forming a grid
US7221366B2 (en) 2004-08-03 2007-05-22 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video
US8477173B2 (en) 2004-10-15 2013-07-02 Lifesize Communications, Inc. High definition videoconferencing system
DE112005003003T5 (en) 2004-12-10 2007-11-15 Kyoto University System, method and program for compressing three-dimensional image data and recording medium therefor
WO2006084385A1 (en) 2005-02-11 2006-08-17 Macdonald Dettwiler & Associates Inc. 3d imaging system
DE102005023195A1 (en) 2005-05-19 2006-11-23 Siemens Ag Method for expanding the display area of a volume recording of an object area
US8228994B2 (en) 2005-05-20 2012-07-24 Microsoft Corporation Multi-view video coding based on temporal and view decomposition
WO2007005752A2 (en) 2005-07-01 2007-01-11 Dennis Christensen Visual and aural perspective management for enhanced interactive video telepresence
JP4595733B2 (en) 2005-08-02 2010-12-08 カシオ計算機株式会社 Image processing device
US7551232B2 (en) 2005-11-14 2009-06-23 Lsi Corporation Noise adaptive 3D composite noise reduction
KR100810268B1 (en) 2006-04-06 2008-03-06 삼성전자주식회사 Embodiment Method For Color-weakness in Mobile Display Apparatus
US7778491B2 (en) 2006-04-10 2010-08-17 Microsoft Corporation Oblique image stitching
US7679639B2 (en) 2006-04-20 2010-03-16 Cisco Technology, Inc. System and method for enhancing eye gaze in a telepresence system
EP1862969A1 (en) 2006-06-02 2007-12-05 Eidgenössische Technische Hochschule Zürich Method and system for generating a representation of a dynamically changing 3D scene
US20080043024A1 (en) 2006-06-26 2008-02-21 Siemens Corporate Research, Inc. Method for reconstructing an object subject to a cone beam using a graphic processor unit (gpu)
USD610105S1 (en) 2006-07-10 2010-02-16 Cisco Technology, Inc. Telepresence system
US20080095465A1 (en) 2006-10-18 2008-04-24 General Electric Company Image registration system and method
US8213711B2 (en) 2007-04-03 2012-07-03 Her Majesty The Queen In Right Of Canada As Represented By The Minister Of Industry, Through The Communications Research Centre Canada Method and graphical user interface for modifying depth maps
GB0708676D0 (en) 2007-05-04 2007-06-13 Imec Inter Uni Micro Electr A Method for real-time/on-line performing of multi view multimedia applications
US8253770B2 (en) 2007-05-31 2012-08-28 Eastman Kodak Company Residential video communication system
US8063901B2 (en) 2007-06-19 2011-11-22 Siemens Aktiengesellschaft Method and apparatus for efficient client-server visualization of multi-dimensional data
JP4947593B2 (en) 2007-07-31 2012-06-06 Kddi株式会社 Apparatus and program for generating free viewpoint image by local region segmentation
US8223192B2 (en) 2007-10-31 2012-07-17 Technion Research And Development Foundation Ltd. Free viewpoint video
US8466913B2 (en) 2007-11-16 2013-06-18 Sportvision, Inc. User interface for accessing virtual viewpoint animations
CN102016877B (en) * 2008-02-27 2014-12-10 索尼计算机娱乐美国有限责任公司 Methods for capturing depth data of a scene and applying computer actions
US8442355B2 (en) 2008-05-23 2013-05-14 Samsung Electronics Co., Ltd. System and method for generating a multi-dimensional image
US7840638B2 (en) 2008-06-27 2010-11-23 Microsoft Corporation Participant positioning in multimedia conferencing
US8106924B2 (en) 2008-07-31 2012-01-31 Stmicroelectronics S.R.L. Method and system for video rendering, computer program product therefor
WO2010023580A1 (en) 2008-08-29 2010-03-04 Koninklijke Philips Electronics, N.V. Dynamic transfer of three-dimensional image data
US20110169824A1 (en) 2008-09-29 2011-07-14 Nobutoshi Fujinami 3d image processing device and method for reducing noise in 3d image processing device
WO2010037512A1 (en) 2008-10-02 2010-04-08 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Intermediate view synthesis and multi-view data signal extraction
US8200041B2 (en) 2008-12-18 2012-06-12 Intel Corporation Hardware accelerated silhouette detection
US8436852B2 (en) 2009-02-09 2013-05-07 Microsoft Corporation Image editing consistent with scene geometry
US8477175B2 (en) 2009-03-09 2013-07-02 Cisco Technology, Inc. System and method for providing three dimensional imaging in a network environment
JP5222205B2 (en) 2009-04-03 2013-06-26 Kddi株式会社 Image processing apparatus, method, and program
US20100259595A1 (en) 2009-04-10 2010-10-14 Nokia Corporation Methods and Apparatuses for Efficient Streaming of Free View Point Video
US8719309B2 (en) 2009-04-14 2014-05-06 Apple Inc. Method and apparatus for media data transmission
US8665259B2 (en) 2009-04-16 2014-03-04 Autodesk, Inc. Multiscale three-dimensional navigation
US8755569B2 (en) 2009-05-29 2014-06-17 University Of Central Florida Research Foundation, Inc. Methods for recognizing pose and action of articulated objects with collection of planes in motion
US8629866B2 (en) 2009-06-18 2014-01-14 International Business Machines Corporation Computer method and apparatus providing interactive control and remote identity through in-world proxy
US9648346B2 (en) 2009-06-25 2017-05-09 Microsoft Technology Licensing, Llc Multi-view video compression and streaming based on viewpoints of remote viewer
KR101070591B1 (en) * 2009-06-25 2011-10-06 (주)실리콘화일 Distance measuring apparatus having dual stereo camera
US8194149B2 (en) 2009-06-30 2012-06-05 Cisco Technology, Inc. Infrared-aided depth estimation
US8633940B2 (en) 2009-08-04 2014-01-21 Broadcom Corporation Method and system for texture compression in a system having an AVC decoder and a 3D engine
US8908958B2 (en) 2009-09-03 2014-12-09 Ron Kimmel Devices and methods of generating three dimensional (3D) colored models
US8284237B2 (en) 2009-09-09 2012-10-09 Nokia Corporation Rendering multiview content in a 3D video system
US8441482B2 (en) 2009-09-21 2013-05-14 Caustic Graphics, Inc. Systems and methods for self-intersection avoidance in ray tracing
US20110084983A1 (en) 2009-09-29 2011-04-14 Wavelength & Resonance LLC Systems and Methods for Interaction With a Virtual Environment
US9154730B2 (en) 2009-10-16 2015-10-06 Hewlett-Packard Development Company, L.P. System and method for determining the active talkers in a video conference
US8537200B2 (en) 2009-10-23 2013-09-17 Qualcomm Incorporated Depth map generation techniques for conversion of 2D video data to 3D video data
CN102792699A (en) 2009-11-23 2012-11-21 通用仪表公司 Depth coding as an additional channel to video sequence
US8487977B2 (en) 2010-01-26 2013-07-16 Polycom, Inc. Method and apparatus to virtualize people with 3D effect into a remote room on a telepresence call for true in person experience
US20110211749A1 (en) 2010-02-28 2011-09-01 Kar Han Tan System And Method For Processing Video Using Depth Sensor Information
US8898567B2 (en) 2010-04-09 2014-11-25 Nokia Corporation Method and apparatus for generating a virtual interactive workspace
EP2383696A1 (en) 2010-04-30 2011-11-02 LiberoVision AG Method for estimating a pose of an articulated object model
US20110304619A1 (en) 2010-06-10 2011-12-15 Autodesk, Inc. Primitive quadric surface extraction from unorganized point cloud data
US8411126B2 (en) 2010-06-24 2013-04-02 Hewlett-Packard Development Company, L.P. Methods and systems for close proximity spatial audio rendering
KR20120011653A (en) * 2010-07-29 2012-02-08 삼성전자주식회사 Image processing apparatus and method
US8659597B2 (en) 2010-09-27 2014-02-25 Intel Corporation Multi-view ray tracing using edge detection and shader reuse
US8787459B2 (en) 2010-11-09 2014-07-22 Sony Computer Entertainment Inc. Video coding methods and apparatus
US9123115B2 (en) * 2010-11-23 2015-09-01 Qualcomm Incorporated Depth estimation based on global motion and optical flow
JP5858381B2 (en) * 2010-12-03 2016-02-10 国立大学法人名古屋大学 Multi-viewpoint image composition method and multi-viewpoint image composition system
US8693713B2 (en) 2010-12-17 2014-04-08 Microsoft Corporation Virtual audio environment for multidimensional conferencing
US8156239B1 (en) 2011-03-09 2012-04-10 Metropcs Wireless, Inc. Adaptive multimedia renderer
EP2707834B1 (en) 2011-05-13 2020-06-24 Vizrt Ag Silhouette-based pose estimation
US8867886B2 (en) 2011-08-08 2014-10-21 Roy Feinson Surround video playback
CN103828359B (en) 2011-09-29 2016-06-22 杜比实验室特许公司 Method for generating views of a scene, encoding system, and decoding system
US9830743B2 (en) 2012-04-03 2017-11-28 Autodesk, Inc. Volume-preserving smoothing brush
US9058706B2 (en) 2012-04-30 2015-06-16 Convoy Technologies Llc Motor vehicle camera and monitoring system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5602903A (en) * 1994-09-28 1997-02-11 Us West Technologies, Inc. Positioning system and method
US6163337A (en) * 1996-04-05 2000-12-19 Matsushita Electric Industrial Co., Ltd. Multi-view point image transmission method and multi-view point image display method
US20070153000A1 (en) * 2005-11-29 2007-07-05 Siemens Corporate Research Inc Method and Apparatus for Discrete Mesh Filleting and Rounding Through Ball Pivoting
US20070124015A1 (en) * 2005-11-30 2007-05-31 Tian Chen System and method for extracting parameters of a cutting tool
US20090262977A1 (en) * 2008-04-18 2009-10-22 Cheng-Ming Huang Visual tracking system and method thereof
US20110282473A1 (en) * 2008-04-30 2011-11-17 Otismed Corporation System and method for image segmentation in generating computer models of a joint to undergo arthroplasty

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Mike Roberts, Jeff Packer, Mario Costa Sousa, Joseph Ross Mitchell, "A Work-Efficient GPU Algorithm for Level Set Segmentation", High Performance Graphics (2010), June 25-27, 2010; video published at https://vimeo.com/24167006. *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180176533A1 (en) * 2015-06-11 2018-06-21 Conti Temic Microelectronic Gmbh Method for generating a virtual image of vehicle surroundings
US10412359B2 (en) * 2015-06-11 2019-09-10 Conti Temic Microelectronic Gmbh Method for generating a virtual image of vehicle surroundings
US10681337B2 (en) * 2017-04-14 2020-06-09 Fujitsu Limited Method, apparatus, and non-transitory computer-readable storage medium for view point selection assistance in free viewpoint video generation
US20210409669A1 (en) * 2018-11-21 2021-12-30 Boe Technology Group Co., Ltd. A method for generating and displaying panorama images based on rendering engine and a display apparatus
US11589026B2 (en) * 2018-11-21 2023-02-21 Beijing Boe Optoelectronics Technology Co., Ltd. Method for generating and displaying panorama images based on rendering engine and a display apparatus
WO2021063271A1 (en) * 2019-09-30 2021-04-08 Oppo广东移动通信有限公司 Human body model reconstruction method and reconstruction system, and storage medium
US11928778B2 (en) 2019-09-30 2024-03-12 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for human body model reconstruction and reconstruction system

Also Published As

Publication number Publication date
US9256980B2 (en) 2016-02-09
US20130321589A1 (en) 2013-12-05
US8917270B2 (en) 2014-12-23
US20130321593A1 (en) 2013-12-05
US9846960B2 (en) 2017-12-19
US20130321418A1 (en) 2013-12-05
US9251623B2 (en) 2016-02-02
US20130321566A1 (en) 2013-12-05
US20130321396A1 (en) 2013-12-05
US20130321586A1 (en) 2013-12-05
US20130321590A1 (en) 2013-12-05
US20130321410A1 (en) 2013-12-05
US20130321575A1 (en) 2013-12-05

Similar Documents

Publication Publication Date Title
US20130321413A1 (en) Video generation using convict hulls
US10304244B2 (en) Motion capture and character synthesis
US7310101B2 (en) System and method for generating generalized displacement maps from mesostructure geometries
US11398059B2 (en) Processing 3D video content
US9704282B1 (en) Texture blending between view-dependent texture and base texture in a geographic information system
KR20080051158A (en) Photographing large objects
CN102722861A (en) CPU-based graphic rendering engine and realization method
US10638151B2 (en) Video encoding methods and systems for color and depth data representative of a virtual reality scene
US9965893B2 (en) Curvature-driven normal interpolation for shading applications
US20240020915A1 (en) Generative model for 3d face synthesis with hdri relighting
US9401044B1 (en) Method for conformal visualization
Zhang et al. An efficient dynamic volume rendering for large-scale meteorological data in a virtual globe
US11704839B2 (en) Multiview video encoding and decoding method
CN114820980A (en) Three-dimensional reconstruction method and device, electronic equipment and readable storage medium
Usher et al. In situ exploration of particle simulations with CPU ray tracing
CN108921908B (en) Surface light field acquisition method and device and electronic equipment
US11954802B2 (en) Method and system for generating polygon meshes approximating surfaces using iteration for mesh vertex positions
US20230394767A1 (en) Method and system for generating polygon meshes approximating surfaces using root-finding and iteration for mesh vertex positions
US11961186B2 (en) Method and system for visually seamless grafting of volumetric data
EP4287134A1 (en) Method and system for generating polygon meshes approximating surfaces using root-finding and iteration for mesh vertex positions
Hao et al. Image completion with perspective constraint based on a single image
US20240005605A1 (en) Method and system for visually seamless grafting of volumetric data
Chen et al. A quality controllable multi-view object reconstruction method for 3D imaging systems
Jiang et al. A large-scale scene display system based on webgl
Hall et al. Networked and multimodal 3d modeling of cities for collaborative virtual environments

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SWEENEY, PATRICK;GILLETT, DON;REEL/FRAME:029960/0091

Effective date: 20130228

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0541

Effective date: 20141014

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION