WO2015051104A1

WO2015051104A1 - Classification and regression for radio frequency localization

Info

Publication number: WO2015051104A1
Application number: PCT/US2014/058802
Authority: WO
Inventors: Benjamin BALAGUER; Gorkem ERINC; Stefano CARPIN
Original assignee: The Regents Of The University Of California
Priority date: 2013-10-04
Filing date: 2014-10-02
Publication date: 2015-04-09

Abstract

In various exemplary embodiments, a system and related method of determining localization of a robot in an unknown environment using radio frequency signals is disclosed. In one embodiment, the method comprises collecting a plurality of geotagged WiFi measurements, training a random forest, and developing a WiFi signature map from the trained random forest.

Description

CLASSIFICATION AND REGRESSION FOR RADIO FREQUENCY LOCALIZATION

RELATED APPLICATION

[0001] This application claims priority to United States Provisional Patent Application Number 61/886,935, entitled "CLASSIFICATION AND REGRESSION FOR RADIO FREQUENCY LOCALIZATION," filed on October 4, 2013, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[0002] The present application relates generally to the field of computer technology and, in a specific exemplary embodiment, to a system and method of determining localization of a robot in an unknown environment using WiFi signals.

BACKGROUND

[0003] The utilization of wireless signals transmitted from access points (APs) or other signal generators, such as home-made sensors, has enjoyed great interest from the robotics community, in particular for its applicability to the localization (e.g., positioning of a person or object) problem. In this area, the application of data-driven methods has prevailed thanks to two distinct but popular approaches. Under one approach, modeling attempts to understand, through collected data, how the wireless signal propagates under different conditions. A goal of the modeling approach is to generate a wireless signal model that can then be utilized for localization. Under the second approach, mapping uses collected data directly by combining spatial coordinates with wireless signal strengths to create maps from which a robot can localize.

However, wireless signals may be distorted by wave phenomena such as diffraction, scattering, reflection, and absorption. Consequently, practical implementations of wireless signal modeling have not previously been available for unknown environments since the models need to be trained in conditions similar to those that will be encountered. That is, the models require at least some a priori information about the environment.

[0004] A variety of methods have been devised to solve this problem. Two of the earliest solutions used nearest neighbor searches and histograms of signal strengths for each. Starting from the observation that histograms of signal strengths are generally normally distributed, various other methods have recently been proposed using Gaussian distributions to account for the inevitable variance in received signal strengths. Specifically, for each location and each access point, a Gaussian distribution is derived from training data. An unknown location described by a new set of observed signal strengths is then determined using Bayesian filtering. These methods exploit the inherently available spatial and temporal information of a moving robot through a Hidden Markov Model (HMM), which, unlike the HMM, requires additional sensory feedback (e.g., odometry). Another approach exploits Support Vector Machines by formulating the problem as a classification instance.

[0005] However, the Gaussian processes require a long parameter optimization step making the process difficult to use in real-time operations with limited computational power, as may be found in mobile devices such as smartphones and tablets.

BRIEF DESCRIPTION OF DRAWINGS

[0006] Various ones of the appended drawings merely illustrate exemplary embodiments of the subject matter presented herein. Therefore, the appended drawings are provided to allow a person of ordinary skill in the art to better understand the concepts disclosed herein, and therefore cannot be considered as limiting a scope of the disclosed subject matter.

[0007] FIG. 1 shows an exemplary locational WiFi map indicating wireless signal strength data collected from various access points;

[0008] FIG. 2 shows an exemplary locational WiFi map indicating wireless signal strength data collected from various access points in a residential outdoor environment;

[0009] FIG. 3 shows an exemplary classification error graph indicating the average classification error (in meters) versus the number of readings per location for a number of algorithms used in a determination of producing locations for a WiFi map;

[0010] FIG. 4 shows a cumulative probability graph for classifying a location within the error margin (in meters) indicated on the x-axis;

[0011] FIG. 5 shows an average training time graph comparing training times for several of the algorithms discussed herein;

[0012] FIG. 6 shows average errors for each algorithm and each dataset collected and processed at two universities;

[0013] FIG. 7 shows an example of the highest Random Forest vote;

[0014] FIG. 8 shows an average regression error graph for the average accuracy (from the 10 runs and 50 random samples used for training) of the four best classification algorithms compared against the disclosed regression algorithm;

[0015] FIG. 9 shows a localized run showing the output from a laser range finder simultaneous localization and mapping (LRF SLAM), a WiFi localizer employing a Monte Carlo localization (MCL), and the ground truth;

[0016] FIG. 10 shows three non-overlapping localized outdoor runs showing the output of the WiFi localizer with MCL compared with GPS points;

[0017] FIG. 11 shows an exemplary mapping flowchart of a generalized method for collecting WiFi measurements and developing a WiFi map;

[0018] FIG. 12 shows an exemplary localization flowchart of a more detailed method for collecting WiFi measurements and developing a WiFi map; and

[0019] FIG. 13 shows a simplified block diagram of a machine in an exemplary form of a computing system within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.

DETAILED DESCRIPTION

[0020] The description that follows includes illustrative systems, methods, techniques, instruction sequences, and computing machine program products that embody various aspects of the subject matter described herein. In the following description, for purposes of explanation, numerous specific details are set forth to provide an understanding of various embodiments of the subject matter. It will be evident, however, to those skilled in the art that embodiments of the subject matter may be practiced without these specific details. Further, well-known instruction instances, protocols, structures, and techniques have not been shown in detail.

[0021] As used herein, the term "or" may be construed in either an inclusive or exclusive sense. Similarly, the term "exemplary" is construed merely to mean an example of something, or an exemplar, and not necessarily a preferred or ideal means of accomplishing a goal. Additionally, although various exemplary embodiments discussed below focus on particular data collection techniques, algorithms, and methods, the embodiments are given merely for clarity in disclosure. Thus, various types of data collection techniques, algorithms, and methods are considered as being within a scope of the subject matter described.

[0022] A novel algorithmic process disclosed herein may be turned into applications (e.g., software, firmware, or hardware), and allows WiFi-capable devices, such as mobile phones and laptops, to localize inside buildings. As noted above, other sensor types, such as global positioning sensors (GPS), cannot localize inside buildings or similar environments. The disclosed process can rely exclusively on a WiFi-capable device. The novel algorithmic process then provides indoor and outdoor localization and mapping using wireless signals. People and other objects can be localized in various environments, either indoors or outdoors, by creating a "map" based on collected WiFi signals. The mapping may be performed and produced to sub-meter accuracy using the wireless WiFi signals. Additionally, the same or similar disclosed techniques and processes may be used to track movable objects, such as animals.

[0023] A problem involving localization and mapping is solved using team-based robot mapping and localization using wireless signals broadcast from access points commonly found in contemporary urban environments. The techniques and methods discussed herein employ mapping and localizing in an unknown environment, where the locations of access points are unspecified and for which training data are a priori unavailable.

[0024] The generalized approach is based on a heterogeneous method of combining robots with different sensor payloads. The algorithmic design assumes an ability to produce a map in real-time from a sensor-full robot that can quickly be shared by sensor-deprived robot team members. More specifically, the WiFi localization problem is considered as a classification and regression problem that is subsequently solved using machine learning techniques.

[0025] In order to produce a robust system, spatial and temporal information inherent in robot motion is considered by running Monte Carlo Localization (MCL) on top of the regression algorithm discussed herein to greatly improve effectiveness of the algorithm. Further, a significant number of experiments were performed and are presented to demonstrate the accuracy, effectiveness, and practicality of the algorithm.

[0026] A tradeoff naturally arises when using low-cost multi-robot teams versus a single (and expensive) robot with an extensive number of sensors. Reducing the number of sensors within a robot decreases the price of the individual robots. However, reducing the number of sensors on a robot also makes the localization problem more challenging.

[0027] Generally, these low-cost robots are not practical in unknown environments due to their lack of perception abilities. However, team-based low-cost robots can exchange information amongst themselves utilizing WiFi technology. Further, the low-cost robots can supply rough estimations of local movements via odometry or similar inexpensive low-accuracy sensors. A heterogeneous setup can pair a number of low-cost robots with a single expensive robot capable of mapping an environment by traditional means (e.g., simultaneous localization and mapping (SLAM) using a laser range finder (LRF) or other sophisticated proximity sensors). Within this scenario, as described herein, a map of an unknown environment may be produced in real-time using the single more capable robot, so that the low-cost robots can localize themselves.

[0028] As discussed herein, wireless signals from access points are employed since the low-cost robots have increased sensory constraints. However, the wireless signals can be notoriously difficult to work with due to nonlinear fading, obstructions, and other multipath effects as noted above. Nonetheless, a localizer based on WiFi signals can offer certain advantages. For example, WiFi access points are uniquely identifiable, can be used indoors or outdoors, and, as discussed above, are already part of most urban unknown environments to be mapped. Additionally, the wirelessly-transmitted signals are available to anything within range. The transmitted signals therefore allow robots to exploit access points without having to actually connect to the access points.

[0029] In various embodiments disclosed herein, the problem of WiFi localization may be considered as a machine learning problem, which may be initially solved using classification and regression theory. Utilizing a Monte Carlo Localization (MCL) operation, described in more detail below, the localizer may be improved by exploiting spatial and temporal information inherently encoded within odometry functions or motion sensors of even low-cost robots. The odometry functions can estimate a change in position as a function of time.

[0030] As further discussed herein, various algorithms are implemented, considered, and contrasted. The novel algorithm used for localization adds odometry for robustness and is implemented as an MCL operation. The applicability of the algorithm is considered for both unknown environments and real-time performance.

[0031] Referring now to FIG. 1, an exemplary locational WiFi map 100 indicating wireless signal strength data collected from various access points is shown. Several structural areas 110, or obstructions, are also shown in the locational WiFi map 100 of FIG. 1. The several structural areas 110 may be indicative of many types of obstructions including, for example, buildings, walls within a building, or other relatively inaccessible locations. In various embodiments, the structural areas 110 may simply indicate positions in which no WiFi readings were taken, even though the positions may have readily been accessible. Therefore, a person of ordinary skill in the art will understand the structural areas 110 may indicate positions in which readings were not taken, whether or not any physical obstruction was actually present.

[0032] The locational WiFi map 100 is shown to include an x-distance locator 101 (distance given in meters, along the abscissa) and a y-distance locator 103 (also given in meters, along the ordinate axis). A number of mapping locations, along with a relative signal strength of the WiFi signal collected at that location, are shown. For example, a first mapping location 105 having a first relative signal strength value 107 and a second mapping location 109 having a second relative signal strength value 111 are given by way of example. A person of ordinary skill in the art will recognize that the relative signal strength values are provided in arbitrary units simply to give context to the mapping activity. However, a larger outside circle is indicative of a larger signal strength than a smaller outside circle. Also, the number of circles, that is, a "circle pattern," for each mapping location is indicative of a particular access point. For example, the first mapping location 105 and the second mapping location 109 are mapping locations from different access points since the circle patterns are different for each. However, a third mapping location 113, a fourth mapping location 115, and a fifth mapping location 117 are all based on signals received from the same access point since each shares the same circle pattern. Also, in this example, the third 113, fourth 115, and fifth 117 mapping locations have similar relative signal strengths since the outer circle of each is approximately the same size.

[0033] During a training phase of a mapping approach, wireless signal mapping techniques tie a spatial coordinate with a set of observed signal strengths from each of the different access points. The wireless signal mapping is then used to create a signal map. Given a new set of observed signal strengths acquired at an unknown coordinate, the goal is to use the map to retrieve the correct spatial coordinate as described in more detail below.

System Setup and Problem Definition

[0034] For example, to develop the locational WiFi map 100 of FIG. 1, a team of $r$ robots was employed. The team of robots included a "sensor-full" robot, described as the mapper robot. The mapper robot was capable of building a map by traditional means (e.g., simultaneous localization and mapping (SLAM) or a global positioning system (GPS)). The remaining $r - 1$ robots were low-cost robots, described as the localizer robots. The localizer robots' only sensors were odometry functions and a WiFi card. One goal was to develop a system where the mapper robot created the locational WiFi map 100 that the localizer robots could use to localize strictly utilizing their limited sensors.

[0035] For experimental purposes, a MobileRobots P3AT (available from Adept Mobilerobots LLC of Amherst, New Hampshire, USA) equipped with an LMS200 Laser Range Finder (LRF) was used as the mapper robot, and various iRobot Create platforms (available from iRobot of Bedford, Massachusetts, USA) were used as the localizer robots. The mapper robot uses GMapping (available online from OpenSLAM, licensed under the Creative Commons) to solve the SLAM problem. All of the software code described herein was capable of executing on a typical consumer laptop, without relying on multiple cores, graphics processing units, or extensive memory.

[0036] The approach was split into a mapping phase, involving the mapper robot, and a localization phase, involving the localizer robots. During the mapping phase, the mapper robot periodically collected WiFi signal strengths from all the access points in range and associated them with the current Cartesian coordinates provided by GMapping. In terms of notation, for every sample Cartesian position $c_p = [x_p, y_p]$, an observation vector $Z_p = [z_p^1, z_p^2, \ldots, z_p^{|a|}]$ was acquired using a WiFi card, where $|a|$ was the total number of access points seen throughout the environment. Each signal strength $z_p^a$ was measured in units of dBm, the most commonly provided measurement from hardware WiFi cards. The measurements typically ranged from about -90 dBm to about -10 dBm with lower values (e.g., -90 dBm) indicating lower signal strengths. Since the access points in the environment could not all be sensed from a single location, the observation vector, $Z_p$, was dynamically increased as the robot moved in the environment and identified previously unseen access points. Additionally, since some access points could not be seen from certain locations, the corresponding value in the observation vector was set as $z_p^a = -100$ for any access point $a$ that could not be seen from location $p$. In order to obtain an indication of the signal strength noise and increase the robustness of the algorithm, multiple observations at each location were collected, resulting in a vector of observations for each location, $Z_p = [Z_p^1, Z_p^2, \ldots, Z_p^{|s|}]$, where $|s|$ is the total number of observations performed at each location $p$. The entirety of the data used to build a complete WiFi map in real-time, acquired by the mapper robot, could then be represented as a matrix $T$ of locations and observations, where $|p|$ is the total number of locations for which WiFi signals were acquired.

Therefore, the matrix $T$ contains $|s|$ observations for each of the $|p|$ locations and each of the $|a|$ access points in the locational WiFi map 100 of FIG. 1, where each observation is labeled with the associated Cartesian coordinates acquired from GMapping, $c_p$.
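The assembly of the training data described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `build_training_matrix`, the `scans` input format, and the access point identifiers are hypothetical, and each row is assumed to hold the coordinates followed by one signal strength per access point.

```python
# Sketch: assemble training data T from per-location WiFi scans.
# All names here are illustrative assumptions, not taken from the source.
MISSING_DBM = -100  # value the text assigns to access points not seen at a location


def build_training_matrix(scans):
    """scans: list of (x, y, {ap_id: dBm}) tuples, one tuple per observation.

    Returns (ap_ids, rows), where each row is [x, y, z^1, ..., z^|a|].
    """
    # Collect every access point seen anywhere, in first-seen order,
    # mirroring how the observation vector grows as the robot moves.
    ap_ids = []
    for _, _, readings in scans:
        for ap in readings:
            if ap not in ap_ids:
                ap_ids.append(ap)

    # Fill unseen access points with the -100 placeholder.
    rows = []
    for x, y, readings in scans:
        rows.append([x, y] + [readings.get(ap, MISSING_DBM) for ap in ap_ids])
    return ap_ids, rows


scans = [
    (0.0, 0.0, {"ap_A": -42, "ap_B": -71}),  # first observation at location 1
    (0.0, 0.0, {"ap_A": -44, "ap_B": -69}),  # second observation at location 1
    (1.0, 0.0, {"ap_B": -63, "ap_C": -80}),  # location 2 sees a new access point
]
ap_ids, T = build_training_matrix(scans)
```

Padding earlier rows after a new access point appears keeps every row the same length, which is what lets the data be treated as a single matrix.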

[0037] Upon viewing the observation vectors that comprise the matrix, a person of ordinary skill in the art will recognize that two parameters can increase the accuracy of the mapping: the number of locations, $|p|$, to consider for the WiFi map, and the number of observations, $|s|$, performed at each of the locations. The total number of access points, the parameter $|a|$, is dictated by the structure of the unknown environment. The number of locations, $|p|$, is proportional to a size of the mapped environment and is at least partially dependent on the density of the measurements.

[0038] For example, empirical evidence has shown that recording observations at approximately every one meter yielded accurate maps, but no restrictions are placed on the alignment of the locations (i.e., they do not need to be grid-aligned or follow any particular structure). The number of observations per location, $|s|$, encompasses a possible tradeoff that has been analyzed. For example, the higher the number of observations performed at each location (a larger $|s|$), the better the noise model but the longer the period of time needed to complete the mapping process.

[0039] Additional empirical evidence has also shown that signal strength readings are consistent across different robots having similar hardware. The consistency in signal strength readings implies that WiFi maps may be shared amongst robots. Also, signal strength measurements taken on different days or at different times were, although slightly different, not significantly so, and indicate that WiFi maps acquired at a certain time can still be effective for robots operating in the same environment at a later time.

Classification

[0040] The locational WiFi map 100 of FIG. 1 provides a good qualitative indication that it is possible to differentiate between different locations by only considering signal strengths received from the access points. The WiFi localizer is developed by casting the localization problem as a machine learning classification problem.

[0041] Mathematically, a function $f : Z \to p$ is produced from the training data $T$ acquired by the mapper robot. The function takes a new observation $Z = [z^1, z^2, \ldots, z^{|a|}]$ acquired by one of the localizer robots and returns the location, $p$, of the robot, from which the Cartesian coordinate, $c_p$, can be acquired readily. Casting the problem in such a manner is a simplification, since a robot needs to be positioned exactly at one of the locations in the training data in order for $f$ to return the exact coordinate $c_p$ (i.e., classification does not perform interpolation or regression). Solving the classification problem allowed the effectiveness of various algorithms to be evaluated and applied to build a better localizer, as will be further discussed below.

[0042] Computing the function from the matrix T can be achieved using different techniques. A total of six algorithms were considered, three of which have been published in the WiFi localization literature (the Gaussian model, the Support Vector Machine model, and the Nearest Neighbor Search model) and three others that have enjoyed popularity in machine learning tasks (the Decision Tree model, the Random Forest model, and the Multinomial Logit model). Each algorithm is discussed briefly below in the context of the definition of the problem as stated above.

Algorithms Considered

[0043] Decision Tree: A decision tree is a binary tree constructed automatically from the training data as provided in the matrix $T$, shown above. Each node of the tree corresponds to a decision made on one of the input parameters, $z^a$, that divides the node's data into two new subsets, one for each of the node's sub-trees, in such a way that the same target variables, $p$, are in the same subsets. The process is iterated in a top-down structure, working from the root (whose data subset is $T$) down to the leaves, and may stop when each node's data subset contains one and only one target variable or when adding new nodes becomes ineffective.

[0044] Various formulas have been proposed to compute the "best" partitioning of a node's data subset, the most popular of which are the Gini coefficient, the twoing rule, and the information gain. The formulas are known independently in the art. After experimental assessments, the choice of partitioning criterion was found to be insignificant in terms of localization accuracy. Therefore, the Gini coefficient was applied herein although other formulas may be applied as well. In order to classify a new observation $Z$, the appropriate parameter $z^a$ is compared at each node, the decision of which dictates which branch is taken - a step that may be repeated until a leaf is reached. The target variable $p$ at the leaf is the location of the robot.
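The node-splitting step can be illustrated with a short sketch. It assumes the impurity form $1 - \sum_k p_k^2$ commonly used for decision trees as the meaning of the Gini criterion named above, and it searches thresholds at midpoints between observed values; the function names and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of location labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best binary split.

    rows: list of signal-strength vectors; labels: one location label per row.
    """
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for row, lab in zip(rows, labels) if row[j] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best


# Toy data: two access points, two locations with clearly separated readings.
rows = [[-40, -80], [-42, -78], [-70, -50], [-72, -52]]
labels = ["p1", "p1", "p2", "p2"]
feature, threshold, impurity = best_split(rows, labels)
```

A full tree would apply `best_split` recursively to the left and right subsets until each subset holds a single location or no split reduces the impurity.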

[0045] Random Forest: Random Forests are an ensemble of decision trees built to reduce over-fitting behaviors often observed in single decision trees. The ensemble of decision trees is achieved by creating a set of $|d|$ trees, as described in the previous paragraphs, each with a different starting dataset, $T_d$, selected randomly with replacement from the full training data $T$. An additional difference comes from the node splitting process, which is performed by considering a random set of $|q|$ input parameters as opposed to all of the input parameters. The partitioning criterion is still used, but only the $|q|$ randomly selected parameters are considered. In order to classify a new observation, $Z$, the observation is processed by each of the $|d|$ trees, resulting in $|d|$ output variables, some of which may be the same.

[0046] In some sense, each decision tree in the forest votes for an output variable and the one with the most votes is chosen as the location of the robot. It is important to note that a lot of information can be extracted from this voting scheme, since the votes can trivially be converted to the robot's probability of being at a particular location, $P(p \mid Z) = v_p / |d|$, where $v_p$ is the total number of votes received for location $p$ with Cartesian coordinate $c_p$. The number of trees $|d|$ encompasses a tradeoff between speed and accuracy. In an exemplary embodiment, the number of trees was set to $|d| = 30$ after determining that it yielded the highest ratio of accuracy to the number of trees.
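The conversion from votes to probabilities can be sketched in a few lines. The tree outputs below are hypothetical placeholders for the predictions of 30 already-trained trees; only the vote counting is shown, not the training of the forest.

```python
from collections import Counter


def vote_probabilities(tree_votes):
    """Convert one predicted location per tree into P(p | Z) = votes_p / |d|."""
    n_trees = len(tree_votes)
    return {loc: count / n_trees for loc, count in Counter(tree_votes).items()}


def most_voted(tree_votes):
    """Location with the most votes (ties resolved by first occurrence)."""
    return Counter(tree_votes).most_common(1)[0][0]


# Hypothetical outputs of |d| = 30 trees for a single observation Z.
votes = ["p17"] * 21 + ["p18"] * 6 + ["p16"] * 3
probs = vote_probabilities(votes)
winner = most_voted(votes)
```

Keeping the full distribution rather than only the winner preserves how strongly the forest favors neighboring locations, which is the information the text says can be extracted from the voting scheme.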

[0047] Gaussian Model: The Gaussian model technique attempts to model the inherent noise of wireless signal strength readings through a Gaussian distribution. For each location $p$ and for each of the access points, the mean and standard deviation of the signal strength readings are computed, yielding $\mu_p^a$ and $\sigma_p^a$, respectively. Consequently, a total of $|p| \times |a|$ Gaussian distributions are calculated, where all $\mu_p^a$ and $\sigma_p^a$ are computed from $|s|$ values (i.e., the total number of observations performed at each location). The location, $p$, of a new observation, $Z$, is derived using the probability density function of the Gaussian.
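A minimal sketch of such a Gaussian classifier follows. It assumes that the per-access-point densities are combined as a product (computed here as a sum of log densities) and that the most likely location is returned; the text does not preserve the exact formula, so this combination rule, the `MIN_SIGMA` floor, and all names are assumptions.

```python
import math
from statistics import mean, stdev

MIN_SIGMA = 1.0  # assumed floor (dBm) so constant readings do not give zero variance


def fit_gaussians(training):
    """training: {location: list of observation vectors}.

    Returns {location: [(mu, sigma), ...]} with one pair per access point.
    """
    model = {}
    for loc, observations in training.items():
        per_ap = list(zip(*observations))  # group readings by access point
        model[loc] = [(mean(v), max(stdev(v), MIN_SIGMA)) for v in per_ap]
    return model


def log_likelihood(z, params):
    """Sum of per-access-point Gaussian log densities (independence assumed)."""
    total = 0.0
    for value, (mu, sigma) in zip(z, params):
        total += -math.log(sigma * math.sqrt(2 * math.pi))
        total -= (value - mu) ** 2 / (2 * sigma ** 2)
    return total


def classify(z, model):
    """Return the location whose Gaussians best explain observation z."""
    return max(model, key=lambda loc: log_likelihood(z, model[loc]))


training = {
    "p1": [[-40, -80], [-42, -78], [-41, -79]],
    "p2": [[-70, -50], [-72, -52], [-71, -51]],
}
model = fit_gaussians(training)
location = classify([-43, -77], model)
```

Working in log space avoids numerical underflow when many access points are multiplied together; `stdev` needs at least two readings per location, so a single-reading map would require a different noise estimate.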

[0048] Support Vector Machine: Support vector machines work by constructing a set of hyperplanes in such a way that the hyperplanes perfectly divide two data classes (i.e., the hyperplanes perform binary classification). Generating the hyperplanes is an optimization problem that maximizes the distance between the hyperplanes and the nearest training point of either class. Although initially designed as a linear classifier, the usage of kernels (e.g., polynomial, Gaussian, etc.) enables non-linear classification. Since the training data are divided into $|p|$ classes, a support vector machine variant can be utilized that works for multiple classes. For example, in various embodiments, a one-versus-all approach was employed as it was empirically determined to be better than one-versus-one. The one-versus-all approach comprises building $|p|$ support vector machines that each try to separate one class $p$ from the rest of the other classes. This creates a set of $|p|$ binary classification problems, each of which is solved using the standard support vector machine algorithm with a Gaussian kernel. Once all the support vector machines are trained, a new observation $Z$ can be localized by evaluating its high-dimensional position with respect to the hyperplanes for each support vector machine $p$. The class $p$ is chosen by the support vector machine that classifies $Z$ with the greatest distance from its hyperplanes.

[0049] Nearest Neighbor Search: The nearest neighbor search does not require any processing of the training data, $T$. Instead, the distance between all the points in $T$ and a new observation $Z$ is computed, yielding $|p| \times |s|$ distances $D_p^s$. The location $p$ of the new observation is then selected with the formula $p = \arg\min_p (D_p^s)$. The only necessary decision for this algorithm involves the choice of one of the numerous proposed distance formulas, known independently in the art. For example, the Euclidean distance formula may be used.

[0050] Multinomial Logit: Multinomial logit is an extension of logistic regression allowing multiple classes to be considered. Logistic regression separates two classes linearly by fitting a binomial distribution to the training data (e.g., one class is set as a "success" and the other class as a "failure"). This generalized linear model technique produces a set of $|a| + 1$ regression coefficients $\beta$ that are used to calculate the probability of a new observation being a "success." In contrast to the multi-class support vector machine, multinomial logit follows a one-against-one methodology where one class $p_{ref}$ is trained against each of the remaining classes. Consequently, $|p| - 1$ logistic regressions are trained, yielding $|p| - 1$ sets of regression coefficients, $\beta_p$. Once all the regression coefficients are learned, the probability of being at location $p$ given a new observation $Z$ can be calculated as follows:

$$P(p \mid Z) = \frac{y_p}{\sum_{i=1}^{|p|} y_i}$$

where $y_p = \exp(Z \cdot \beta_p)$ when $p \neq p_{ref}$ and $y_p = 1$ when $p = p_{ref}$. The location $p$ can then be selected as the maximum probability by utilizing the formula $\arg\max_p P(p \mid Z)$.
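The probability computation for the multinomial logit can be sketched as below, using the definition of the per-class terms given above (an exponential of the linear score for every class except the reference class, whose term is 1). The coefficients are hypothetical values chosen for illustration, and treating the first coefficient as an intercept is an assumption about how the observation is augmented.

```python
import math


def logit_probabilities(z, betas, reference):
    """P(p | Z) = y_p / sum(y), with y_p = exp(Z . beta_p) and y = 1 for the reference.

    betas: {location: [intercept, b_1, ..., b_|a|]} for every non-reference location.
    """
    scores = {reference: 0.0}  # log y = 0 for the reference class, so y = 1
    for loc, beta in betas.items():
        scores[loc] = beta[0] + sum(b * v for b, v in zip(beta[1:], z))
    shift = max(scores.values())  # subtract the max before exp() for stability
    y = {loc: math.exp(s - shift) for loc, s in scores.items()}
    total = sum(y.values())
    return {loc: value / total for loc, value in y.items()}


# Hypothetical coefficients; in practice they would be fitted to the training data.
betas = {"p2": [3.0, 0.05, 0.0], "p3": [-2.0, 0.0, -0.02]}
probs = logit_probabilities([-40, -80], betas, reference="p1")
best = max(probs, key=probs.get)
```

Subtracting the largest score before exponentiating leaves the ratios unchanged while preventing overflow when scores are large.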

[0051] Referring now to FIG. 2, an exemplary locational WiFi map 200 indicating wireless signal strength data collected from various access points in a residential outdoor environment is shown. The locational WiFi map 200 may be similar to the locational WiFi map 100 of FIG. 1 in several aspects but is provided merely to better describe another environment in which WiFi readings may be taken.

[0052] In FIG. 2, several residential structures 210 are shown. The residential structures 210 may include houses, garages, and other physical structures commonly known to exist near a domicile or structures within a curtilage. Similar to FIG. 1, the locational WiFi map 200 is shown to include an x-distance locator 201 and a y-distance locator 203. A number of mapping locations, along with a relative signal strength of the WiFi signal collected at that location, are shown. For example, a first mapping location 205 having a first relative signal strength value 207 and a second mapping location 209 having a second relative signal strength value 211 are given by way of example. As with FIG. 1, a larger outside circle is indicative of a larger signal strength than a smaller outside circle. Also, the number of circles, that is, a "circle pattern," for each mapping location is indicative of a particular access point.

Results of the Classification Problem

[0053] The experimental results of the WiFi localizer cast as a classification problem may influence algorithmic design choices for the end-to-end algorithm. Therefore, the results are presented before continuing with a description of the algorithm. Specifically, a large indoor training dataset was gathered at a university, stopping the robot at pre-determined locations approximately 1 meter apart. However, in other embodiments, the robots may move continuously while gathering data so stopping is not necessary. The entire dataset comprised 156 locations ($|p| = 156$), 20 signal strength readings for each location ($|s| = 20$), and a total of 48 unique access points ($|a| = 48$). The results shown in this and subsequent sections were obtained by sub-sampling the entire dataset. Specifically, $|s|$ was varied from 1 to 19, essentially representing a percentage of the dataset being used for training, and the remaining data were used for classification. This procedure effectively mimics inevitable differences between data acquired by the mapper robot and the localizer robots. Moreover, the training and classification data were randomly sampled 50 different times for each experiment in order to remove any potential bias from a single sample. Presented results are averages of the 50 samples. Error bars are omitted from the various graphs presented herein since the standard deviations of the results were all similar and insignificant. However, for this embodiment, the process just defined does not represent a real-world scenario since it assumes the mapper robot and the localizer robots followed exactly the same path. It does provide, however, a very good platform for comparing the six classification algorithms presented herein.
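The sub-sampling procedure can be sketched as follows. The `evaluate` callback stands in for training a classifier on the training split and measuring its error on the held-out split; it and the other names are hypothetical, and the toy dataset is far smaller than the one described.

```python
import random


def split_readings(dataset, n_train, rng):
    """dataset: {location: list of readings}.

    Randomly pick n_train readings per location for training;
    the remaining readings are held out for classification.
    """
    train, test = {}, {}
    for loc, readings in dataset.items():
        shuffled = readings[:]
        rng.shuffle(shuffled)
        train[loc] = shuffled[:n_train]
        test[loc] = shuffled[n_train:]
    return train, test


def average_error(dataset, n_train, evaluate, n_samples=50, seed=0):
    """Average evaluate(train, test) over n_samples random splits."""
    rng = random.Random(seed)
    errors = [evaluate(*split_readings(dataset, n_train, rng)) for _ in range(n_samples)]
    return sum(errors) / len(errors)


# Toy dataset: 2 locations with 20 readings each from a single access point.
dataset = {
    "p1": [[-40 - i % 3] for i in range(20)],
    "p2": [[-70 - i % 3] for i in range(20)],
}
train, test = split_readings(dataset, 3, random.Random(0))
```

Sweeping `n_train` from 1 to 19 and calling `average_error` for each value would produce one point per setting, which corresponds to the kind of curve plotted against the number of readings per location.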

[0054] With reference now to FIG. 3, an exemplary classification error graph 300 is shown. The classification error graph 300 indicates the average classification error (in meters) versus the number of readings per location for a number of algorithms used to determine locations for a WiFi map, such as the exemplary locational WiFi map 100 of FIG. 1, discussed above. The classification error graph 300 shows the classification error for algorithms including a decision tree model 301, a Random Forest model 303, a Gaussian model 305, a support vector machine model 307, a nearest neighbor search model 309, and a multinomial logit model 311.

[0055] The accuracy of each algorithm was determined as a function of the number of readings taken at each location, |s|, as shown in the classification error graph 300 of FIG. 3. Since one goal was to ultimately be able to map unknown environments substantially in real time, the initial data acquisition performed by the mapper robot should be efficient. For example, in the exemplary test embodiment described above, obtaining signal strength readings from all the access points in range took, on average, 411 ms. Consequently, to save time, |s| should be minimized.

[0056] As expected, the classification error graph 300 shows a tradeoff between the localization accuracy of the algorithm and the number of readings used during the training phase. Generally, since it may be desirable to limit the time it takes to gather the training data, setting the number of readings per location to 3 (|s| = 3) provides a good compromise between speed and accuracy, especially since the classification error graph 300 shows a horizontal asymptote starting at or around the three-reading point for the best of the algorithms. Consequently, the remaining results presented herein were obtained using three readings per location.

[0057] On average, when |s| = 3, the classification error of the Random Forest is 43.42%, 55.47%, and 57.23% better than previously published WiFi localization algorithms of the Gaussian Model 305, the Support Vector Machine model 307, and the Nearest Neighbor Search model 309, respectively.

[0058] FIG. 4 shows a cumulative probability graph 400 for classifying a location within the error margin (in meters) indicated on the x-axis. FIG. 4 corroborates the findings of the classification error graph 300 of FIG. 3, showing that the best algorithm is the Random Forest model 303, which had never been exploited in the context of WiFi localization. Moreover, the Random Forest model 303 can localize a new observation, Z, to the exact location (i.e., zero margin of error) 88.58% of the time.

[0059] With reference now to FIG. 5, an average training time graph 500 is shown comparing training times for several of the algorithms discussed herein. The average training time graph 500 also includes an online Random Forest model 501 since one of the goals focuses on online WiFi map building. Therefore, the average training time graph 500 considers the time that each of the four algorithms takes to incrementally add a new cluster into a WiFi map.

[0060] The number of clusters was progressively increased from 20 to 100 as the robot explored the map. A skilled artisan will appreciate that the average training time graph 500 of FIG. 5 is provided to analyze trends since 100 clusters accounts for a small map covering an area ranging from 100 to 300 meters. FIG. 5 indicates that the Random Forest model 303 and the Support Vector Machine model 307 both take more time to train, especially since their training times grow linearly and exponentially, respectively, as the number of clusters increases. The Gaussian model 305 and Online Random Forest model 501 (which nearly overlay one another on the graph) add new clusters to a trained map in constant time, taking 27.4 ms and 2.5 ms, respectively. Test results also indicated that the online Random Forest model 501 produced results only marginally inferior to the offline Random Forest model 303.

[0061] In addition to the dataset discussed above with reference to FIG. 3 and FIG. 4, a publicly available dataset covering three floors of the computer science building at Rice University (Houston, Texas, USA) was considered. Referring to the comparison graph 600 of FIG. 6, the Rice University dataset was processed in exactly the same manner as the dataset taken, as described above, at the University of California ("UC") at Merced, California, USA. The data were divided into 50 random samples of training and classification data for |s| = 3. The average error for each algorithm and each dataset is shown in FIG. 6.

[0062] Two observations can be gleaned from the comparison graph 600. First, the trend of the algorithm comparison is the same for both datasets. In other words, the list of the algorithms from best to worst average accuracy (i.e., Random Forest model 303, Gaussian model 305, Support Vector Machine model 307, Nearest Neighbor Search model 309, Decision Tree model 301, and Multinomial Logit model 311) was the same regardless of which dataset was used. Second, the Rice University dataset used signal-to-interference ratios for its observations as opposed to signal strengths. Although the two measures are only loosely related, the graph indicates that the classification algorithms are both general and robust. These findings indicate that the methods proposed are both data- and environment-independent and that the results from the UC-Merced data were not over-fitted. It should be noted that the overall higher average classification error observed in the Rice dataset comes from sparser location samples: 3.33 meters on average as opposed to 0.91 meters for the UC-Merced dataset.

[0063] From these results, when recording three readings per location, with one exception, all of the algorithms can be trained in less than 30 seconds. The exception, the Support Vector Machine model 307, took 90 seconds. In addition, the training process can be parallelized for all the models, adding a potential decrease in the computational time for producing the WiFi map. Put differently, once the mapper robot has acquired its training data, the map can be built in less than 30 seconds, which is very fast relative to the time it takes to explore the environment. Once the WiFi map is created and given to the localizer robots, they can, in turn, localize in less than 200 ms. Consequently, the presented models are effective, from a computational standpoint, even in unknown environments where map building needs to be performed in real time.

Regression

[0064] Although the results of the classification algorithms are encouraging, they do not depict a real-world scenario, where locations explored by the mapper robot, upon which the WiFi map will be created, may be different from those explored by the localizer robots. Consequently, the WiFi localizer may be recast as a regression problem, where some inference is performed to generate Cartesian coordinates for locations that are not encompassed in the training data. Although typical off-the-shelf regression algorithms (e.g., Neural Network, Radial Basis Functions, Support Vector Regression, etc.) may seem like a good choice initially, these regression algorithms may be corrupted by the nature of the training data: the majority of the training data depicts unseen access points (i.e., z_j = -100). In addition, it may be wise to exploit the good results exhibited by the classification algorithms. Consequently, a novel regression algorithm was designed and is described herein that builds upon the classification algorithm that produced the best results from the data studied: the Random Forest model 303. In addition to providing the best classification results, the Random Forest model 303 is appealing due to its voting scheme, which can be interpreted as P(p | Z), for each p ∈ Γ.

[0065] The novel regression algorithm is based on a Gaussian Mixture Model (GMM) and is described by Algorithm 1, below. The GMM is introduced to non-linearly propagate, in the two-dimensional Cartesian space, results acquired from the Random Forest model 303. More specifically, a GMM is built comprising |p| mixture components (line 1). Each mixture component is constructed from a Gaussian distribution with a mean μ(p) (line 2), corresponding to the Cartesian coordinates in the training data, a covariance set by the parameter σ (line 3), and a mixture weight φ(p) that is acquired directly from the voting scheme of the Random Forest (line 4).

[0066] Line 4 highlights a key difference between the classification and regression methodologies, which arises from the fact that taking the mode of the results of the Random Forest, as is done in classification, discards valuable information that is instead exploited in the novel regression algorithm. In some sense, the mixture weights are proportional to the belief of the Random Forest as being at location p given the observation Z (i.e., φ(p) = P(p | Z)).

Algorithm 1 Construct-GMM(Z)

for p ← 1 to |p| do

μ(p) ← Cartesian coordinates of location p in the training data

Σ(p) ← covariance set by σ

φ(p) ← P(p | Z) ← Random-Forest-Predict(Z)

end for

return gmm ← Build-GMM(μ, Σ, φ)
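The construction in Algorithm 1 can be sketched in a few lines. This is a minimal sketch under stated assumptions: the Random Forest is represented only by its per-tree votes (a list of location identifiers), the covariance is taken to be isotropic (σ² times the identity), and the names construct_gmm and gmm_pdf are hypothetical. The source states only that the covariance is governed by σ.

```python
import math
from collections import Counter


def construct_gmm(tree_votes, location_xy, sigma):
    """Build a 2-D Gaussian mixture from Random Forest votes.

    tree_votes  -- one predicted location id per tree (assumed input format)
    location_xy -- dict mapping location id to its (x, y) training coordinates
    sigma       -- standard deviation shared by every component (assumption)
    Returns a list of (weight, (x, y), sigma) components.
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    # The weight phi(p) is the fraction of trees voting for location p;
    # locations that received no vote contribute nothing and are skipped.
    return [(counts[p] / total, xy, sigma)
            for p, xy in location_xy.items() if counts[p] > 0]


def gmm_pdf(gmm, x, y):
    """Evaluate the mixture density at the Cartesian point (x, y)."""
    density = 0.0
    for weight, (mx, my), sigma in gmm:
        squared_distance = (x - mx) ** 2 + (y - my) ** 2
        norm = 2.0 * math.pi * sigma ** 2
        density += weight * math.exp(-squared_distance / (2.0 * sigma ** 2)) / norm
    return density
```

Keeping every vote as a weight, rather than only the most-voted location, is what preserves the information that classification by mode would discard.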

[0067] Algorithm 2, shown below, provides pseudo-code for the remainder of the regression algorithm. Once the GMM is built (line 1), various approaches are available. One approach takes the weighted mean of the model; another draws samples from the GMM with probability φ(p) and computes the weighted mean of the samples. Instead of sampling from the GMM, the regression algorithm disclosed here uses a k Nearest Neighbor Search (line 2) to provide k samples that are not only dependent on the observation Z, but also may come from a different model than the Random Forest. This step adds robustness to the algorithm by combining two of the presented classification algorithms. In other words, where and when one algorithm might fail, the other might succeed. The choice of the Nearest Neighbor Search for this step (as opposed to the Gaussian Model or the Support Vector Machine) comes from the fact that |s| times more samples can be drawn from it (a total of |s| × |p|, as opposed to |p|). The returned Cartesian coordinate (line 3) is finally calculated as the weighted mean of the k Nearest Neighbors, where the weight of each Nearest Neighbor is set from the Probability Density Function (PDF) of the GMM.

[0068] The entire regression algorithm includes two parameters to be set, σ (Algorithm 1) and k (Algorithm 2). The parameter σ dictates how much the Gaussian components influence each other and may be approximately set to the distance between WiFi readings in the training data. The parameter k should be as high as possible in order to provide a large number of samples, yet low enough not to incorporate "neighbors" that are too far away. In one embodiment, setting k to 25% of all the observations in Γ (i.e., k = 0.25 × |s| × |p|) is a workable solution for this tradeoff.
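As a worked instance of the rule for k, the following sketch computes 25% of the training observations. The function name and the floor of one neighbor are assumptions for illustration; the values 3 and 156 are the readings per location and the number of locations reported above for the indoor dataset.

```python
def neighbor_count(readings_per_location, n_locations, fraction=0.25):
    """Return k as a fraction of all training observations, at least 1."""
    return max(1, round(fraction * readings_per_location * n_locations))


# Three readings at each of 156 locations give 468 observations, so k = 117.
k = neighbor_count(3, 156)
```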

Algorithm 2 Regression(Z)

gmm ← Construct-GMM(Z) // See Algorithm 1

nn ← k-NN(Γ, Z, k)

return C ← weighted mean of nn, each neighbor weighted by PDF(gmm, neighbor)
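The regression step of Algorithm 2 can be sketched as below. This is a hedged illustration only: the GMM is assumed to be a list of (weight, (x, y), sigma) components with isotropic covariance, neighbors are found by Euclidean distance in signal space, and the names k_nearest and regress are hypothetical. The fallback to an unweighted mean when every density is zero is an added safeguard, not part of the source.

```python
import math


def gmm_pdf(gmm, x, y):
    """Density of an isotropic 2-D Gaussian mixture at (x, y)."""
    density = 0.0
    for weight, (mx, my), sigma in gmm:
        squared_distance = (x - mx) ** 2 + (y - my) ** 2
        norm = 2.0 * math.pi * sigma ** 2
        density += weight * math.exp(-squared_distance / (2.0 * sigma ** 2)) / norm
    return density


def k_nearest(training, z, k):
    """Return coordinates of the k training readings closest to z in signal space.

    training -- list of ((x, y), signal_vector) pairs
    """
    by_distance = sorted(training, key=lambda item: math.dist(item[1], z))
    return [xy for xy, _ in by_distance[:k]]


def regress(gmm, training, z, k):
    """Weighted mean of the k nearest neighbors, weighted by the GMM density."""
    neighbors = k_nearest(training, z, k)
    weights = [gmm_pdf(gmm, x, y) for x, y in neighbors]
    total = sum(weights)
    if total == 0.0:
        # Safeguard (assumption): fall back to an unweighted mean.
        weights = [1.0] * len(neighbors)
        total = float(len(neighbors))
    cx = sum(w * x for w, (x, _) in zip(weights, neighbors)) / total
    cy = sum(w * y for w, (_, y) in zip(weights, neighbors)) / total
    return cx, cy
```

Because the result is a weighted mean, it can fall between training locations, which is how the regression produces coordinates that were never part of the training data.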

[0069] Given a new observation, Z, the regression essentially maps a three-dimensional surface to the X-Y Cartesian space of the environment, where the higher the Z-value of the surface, the more probable the X-Y location. A representative snapshot of the process is shown in FIG. 7, highlighting a behavior of the disclosed regression algorithm. The algorithm is not only capable of generating Cartesian coordinates that were not part of the initial training data, but the algorithm also considers neighboring votes from the Random Forest classification. For example, FIG. 7 shows an example of the highest Random Forest vote (represented by a "*" at element number 703) that is somewhat isolated and, consequently, its neighbors do not contribute to the overall surface 701. The region close to the actual location of the robot (denoted by a "+" at element number 705), however, comprises many local neighbors whose combined votes outweigh those of the Random Forest classification, resulting in a better regression estimation (denoted by an "x" at element number 707).

Results of the Regression Analysis

[0070] In order to evaluate an overall accuracy level of the disclosed regression algorithm, new data were gathered that mimicked a localizer robot exploring an environment and localizing itself with the regression algorithm trained on Γ. Consequently, 10 new runs (N) were performed, assuming no prior knowledge of the locations covered by Γ. In other words, while the mapper robot data (Γ) and the localizer robot data (N) covered approximately the same environment, they were not acquired at the same Cartesian coordinates. As such, the experiment provided a direct measure of the strength of the regression algorithm. For evaluation purposes, the ground truth position of the localizer robot was manually recorded at discrete locations so as to quantitatively evaluate the regression algorithm.

[0071] FIG. 8 shows an average regression error graph 800 for the average accuracy (from the 10 runs and 50 random samples used for training) of the four best classification algorithms compared against the disclosed regression algorithm 801. Not surprisingly, the disclosed regression algorithm 801 worked very well, outperforming the Random Forest model 303, the Gaussian model 305, the Support Vector Machine model 307, and the Nearest Neighbor Search model 309 classification algorithms by 29.77%, 37.95%, 34.07%, and 28.23%, respectively. Additionally, the disclosed regression algorithm 801 only took 131 ms to localize.

Monte Carlo Localization (MCL)

[0072] Although the disclosed regression algorithm clearly improves the WiFi localizer's accuracy for real-world scenarios, further improvements can be obtained by taking into account the spatial and temporal constraints implicitly imposed by robot motion. In other words, the classification and regression algorithms discussed so far solved the localization problem without taking into consideration the previous location of the robot or, more precisely, the probability distribution of the previous location of the robot. Specifically, an MCL algorithm was used, built from standard implementations known independently in the art, with particular modifications. A motion model, which exploits translational and rotational velocities of the robot, is the same as or similar to the standard implementations. A measurement model, however, takes into account the aforementioned WiFi localizer regression.

[0073] The pseudo-code is presented in Algorithm 3, below, and shows the effectiveness of the GMM implementation (line 1), which can seamlessly transition from regression to MCL. Indeed, the measurement model needs to assign each particle (line 2) the probability of being at the state of the particle given the sensor measurement Z. Thanks to the GMM, the particle's weight is easily retrieved by using the PDF (line 3).

Algorithm 3 Measurement-Model(Z)

gmm ← Construct-GMM(Z) // See Algorithm 1

for m ← 1 to |m| do

m.weight ← PDF(gmm, m.state(X,Y))

end for
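The measurement model of Algorithm 3 can be sketched as follows. The Particle class, the (weight, (x, y), sigma) GMM layout with isotropic covariance, and the final normalization of the weights are assumptions made for this example; the source only states that each particle's weight is read from the GMM's PDF.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    x: float
    y: float
    theta: float
    weight: float = 1.0


def gmm_pdf(gmm, x, y):
    """Density of an isotropic 2-D Gaussian mixture at (x, y)."""
    density = 0.0
    for weight, (mx, my), sigma in gmm:
        squared_distance = (x - mx) ** 2 + (y - my) ** 2
        norm = 2.0 * math.pi * sigma ** 2
        density += weight * math.exp(-squared_distance / (2.0 * sigma ** 2)) / norm
    return density


def measurement_model(particles, gmm):
    """Set each particle's weight from the GMM density at its (x, y) state."""
    for m in particles:
        m.weight = gmm_pdf(gmm, m.x, m.y)
    total = sum(m.weight for m in particles)
    if total > 0.0:
        # Normalization is a common particle-filter convention (assumption).
        for m in particles:
            m.weight /= total
    return particles
```

The heading of each particle plays no part here, since the WiFi measurement carries no information about rotation.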

[0074] In addition to the measurement model, the initialization procedure of the particles may be modified. Instead of randomly or uniformly sampling the states of the particles and giving each particle an equal weight, the robot can be forced to perform a measurement reading, Z, and initialize the particles using that measurement, as shown in Algorithm 4, below. The GMM (line 1) is constructed and, for each particle in the filter (line 2), a data point is sampled from the GMM that will serve as the X,Y state (line 3) of the particle. Since WiFi signal strengths cannot be used to infer the rotation of the robot, the Θ state (line 4) of the particle is randomly sampled. The weight of each particle is calculated from the GMM's PDF (line 5). In order to add robustness to the previously mentioned problem with the rotation of the robots, the best |υ| particles (line 7) are selected and, for each of them, y new particles are added that are the same as the selected particle but have y different Θ states, each randomly sampled between 0 and 2π (line 8). Since the total number of particles has been augmented, the particle initialization is finalized by keeping the best |m| particles (line 9).

Algorithm 4 Initialize-Particles(Z)

gmm ← Construct-GMM(Z) // See Algorithm 1

for m ← 1 to |m| do

m.state(X,Y) ← sample(gmm)

m.state(Θ) ← rand(0 to 2π)

m.weight ← PDF(gmm, m.state(X,Y))

end for

for all m with the top |υ| weights do

Add y particles n_1, ..., n_y where

n_i ← m, and

n_i.state(Θ) ← rand(0 to 2π)

end for

Select the |m| particles with highest weights
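The initialization in Algorithm 4 can be sketched as below. The Particle class, the isotropic GMM layout, and the parameter names n_particles, n_best, and n_copies (standing in for |m|, |υ|, and y) are assumptions made for this illustration.

```python
import math
import random
from dataclasses import dataclass, replace


@dataclass
class Particle:
    x: float
    y: float
    theta: float
    weight: float


def gmm_pdf(gmm, x, y):
    """Density of an isotropic 2-D Gaussian mixture at (x, y)."""
    density = 0.0
    for weight, (mx, my), sigma in gmm:
        squared_distance = (x - mx) ** 2 + (y - my) ** 2
        norm = 2.0 * math.pi * sigma ** 2
        density += weight * math.exp(-squared_distance / (2.0 * sigma ** 2)) / norm
    return density


def sample_gmm(gmm, rng):
    """Pick a component by its weight, then draw an (x, y) point from it."""
    weights = [w for w, _, _ in gmm]
    _, (mx, my), sigma = rng.choices(gmm, weights=weights, k=1)[0]
    return rng.gauss(mx, sigma), rng.gauss(my, sigma)


def initialize_particles(gmm, n_particles, n_best, n_copies, rng):
    """Seed particles from the GMM, then diversify the heading of the best ones."""
    particles = []
    for _ in range(n_particles):
        x, y = sample_gmm(gmm, rng)
        theta = rng.uniform(0.0, 2.0 * math.pi)  # WiFi gives no heading information
        particles.append(Particle(x, y, theta, gmm_pdf(gmm, x, y)))
    best = sorted(particles, key=lambda m: m.weight, reverse=True)[:n_best]
    for m in best:
        for _ in range(n_copies):
            # Same position and weight as m, with a freshly sampled heading.
            particles.append(replace(m, theta=rng.uniform(0.0, 2.0 * math.pi)))
    particles.sort(key=lambda m: m.weight, reverse=True)
    return particles[:n_particles]
```

Duplicating the best particles with different headings raises the chance that at least one well-placed particle also has a heading close to the true one.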

Results of the Monte Carlo Localization (MCL)

[0075] In order to evaluate the MCL, the same experimental data used for the regression results were used here. Namely, the algorithm was trained on Γ and each robot was localized for each of the 10 additional robot runs. In this example, the MCL updated 5000 particles (i.e., |m| = 5000), and the computation took, on average, 2.1 ms for the motion model sampling and 145.6 ms for the measurement model. In order for the MCL to output a Cartesian coordinate, a weighted average was taken of all particles, the result of which could be checked against ground truth. Due to the random nature of the MCL, each of the 10 runs was localized 100 different times. The average error for the 100 trials of all 10 runs was 0.61 meters, with a low average standard deviation of 0.0049 meters. Comparing these results to an offline laser range finder (LRF) SLAM algorithm, known independently in the art, which produced an average error of 0.27 meters, a compelling tradeoff was observed between accuracy and sensor payload (or price) when using the proposed WiFi localizer.

[0076] An illustrative example of a partial run is shown in FIG. 9. FIG. 9 shows a localized run showing the output from an LRF SLAM 905, a WiFi localizer employing the MCL 907, and the ground truth 909 with a number of structures 910 or other obstructions (e.g., walls in a building).

[0077] FIG. 10 shows three non-overlapping localized outdoor runs showing the output of the WiFi localizer with MCL 1009 compared with GPS points 1007. FIG. 10 therefore provides a visualization of three runs (only one run is actually shown to prevent too much overlap) utilizing the WiFi localizer with MCL 1009 in an outdoor environment.

[0078] FIG. 11 shows an exemplary mapping flowchart 1100 of a generalized method for collecting WiFi measurements and developing a WiFi map. At operation 1101, WiFi measurements are collected and the associated data are geotagged to relate the associated data to a specific geographic location.

[0079] In various embodiments, measurement collection can be performed in several ways. In one embodiment, a stop-and-scan method may be implemented. In the stop-and-scan method, a user remains in a given physical location in the environment until a preset number of WiFi scans are executed. The WiFi scans are then tagged with the geographical coordinates of that location. In an embodiment, the coordinates can be entered manually by the user, for instance, by finding the location in the blueprint/map and clicking on the location. In other embodiments, the coordinates may be computed automatically if the position of the user can be calculated by an estimator (e.g., a Bayes filter, accurate GPS, or any other auxiliary localization technology). The method may then continue to either optional operation 1103 or follow path 1110 to operation 1105.

[0080] If the optional operation 1103 is chosen, then the collected WiFi scan locations may be clustered at the optional operation 1103. The optional clustering operation may be utilized to encapsulate multiple WiFi scans in each cluster and associate multiple scans with the coordinates of that cluster center. One goal of this step is to capture a variance in WiFi signal strength values using multiple scans.

[0081] After either the optional operation 1103 or, should the optional operation 1103 not be chosen, after operation 1101 through the path 1110, the method continues to developing a WiFi signature map by training a Random Forest at operation 1105. Various embodiments for training a Random Forest have been discussed herein. The Random Forest is trained using the collected WiFi readings, from operation 1101, as classification data. The associated coordinates of the corresponding locations or cluster IDs created in the previous step may be used as classification labels. The operation 1105 may directly be implemented on a mobile device if the device is equipped with sufficient computational power. Alternatively, the raw data may be uploaded to either a local or remote server (e.g., hardware servers). The training may then be performed on the server.
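The optional clustering of geotagged scans described above can be sketched as follows. The source does not specify a clustering technique, so this example swaps in a simple grid-based grouping as an assumption: scans whose coordinates fall in the same square cell form one cluster, and the cluster center is the mean of the member coordinates. The function name, the cell_size parameter, and the scan layout (a dictionary from access point MAC address to signal strength) are hypothetical.

```python
import math
from collections import defaultdict


def cluster_scans(geotagged_scans, cell_size):
    """Group geotagged WiFi scans into grid cells and compute cluster centers.

    geotagged_scans -- list of ((x, y), scan) pairs
    cell_size       -- side length of a grid cell, in the units of x and y
    Returns a dict mapping cluster id to {"center": (x, y), "scans": [...]}.
    """
    cells = defaultdict(list)
    for (x, y), scan in geotagged_scans:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(((x, y), scan))

    clusters = {}
    for cluster_id, cell in enumerate(sorted(cells)):
        members = cells[cell]
        center_x = sum(x for (x, _), _ in members) / len(members)
        center_y = sum(y for (_, y), _ in members) / len(members)
        clusters[cluster_id] = {
            "center": (center_x, center_y),
            "scans": [scan for _, scan in members],  # retains signal variance
        }
    return clusters
```

The resulting cluster ids could then serve as the classification labels for training, with the several scans in each cluster capturing the variation in signal strength at that place.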

[0082] After the WiFi signature map is developed at operation 1105, the method continues at operation 1107 where the trained Random Forest data are stored, along with corresponding metadata (e.g., the geotagging information, relative signal strength values, identification of one or more access points, etc.). A generated Random Forest structure along with raw data and associated metadata may then be stored in, for example, a remote server. Additional associated metadata may contain information including, but not limited to, coordinates of cluster centers, cluster-WiFi scan membership associations, blueprint of the venue being mapped, GPS coordinates of the venue, information related to the venue (e.g., the venue name, description, history, internal and external photos, etc.), borders of regional spaces of interest (e.g., rooms, stores, aisles, etc. within the venue) and information associated with the regional spaces (e.g., room number, building name, store name, store information, etc.).

[0083] Referring now to FIG. 12, an exemplary localization flowchart 1200 of a more detailed method for collecting WiFi measurements and developing a WiFi map is shown, including various optional steps. At operation 1201, measurements of WiFi signals are collected. The collection of measurements may be repeated, as indicated by a loop 1210, as many times as needed depending upon a level of accuracy and an overall area of the location to be mapped as discussed above. For example, for certain applications, the level of accuracy may be sufficient if one or more access points are located only once every 100 meters. However, in other applications, each of the one or more access points may need to be determined within one meter intervals. Therefore, the number of collected measurements may vary considerably. Further, an area over which the WiFi mapping occurs may be over several hundreds of square meters or over hundreds of square kilometers. In other applications, perhaps only a single level, or portion of a level, within a building needs to be mapped.

[0084] Once the measurements are collected at operation 1201, the venue may be identified at operation 1203. For example, data may be stored from a number of collected measurements, taken at operation 1201. However, in certain embodiments, only a subset of the data may be needed for a particular application. For example, even though measurements may have been collected over an area encompassing several city blocks, a particular application may only need mapping of the collected measurements for a relatively small downtown area located within the several block area. Therefore, in this example, the venue is identified as "City Center." Collected WiFi measurements may be used to identify the venue in which the user is localizing. The identification can be done in several ways including, but not limited to, GPS coordinates of a mobile device, comparing captured unique MAC addresses of access points with addresses stored in a database or server, or by using a classifier as described above.
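One of the venue identification options mentioned above, comparing captured MAC addresses with stored addresses, can be sketched as follows. The function name, the in-memory dictionary standing in for the database or server, and the rule of choosing the venue with the largest overlap are assumptions made for this example.

```python
def identify_venue(observed_macs, venue_macs):
    """Return the venue whose stored access points best match the observed ones.

    observed_macs -- iterable of MAC address strings seen in the current scans
    venue_macs    -- dict mapping venue name to an iterable of known MAC addresses
    Returns None when no stored venue shares any access point with the scans.
    """
    observed = {mac.lower() for mac in observed_macs}
    best_name, best_overlap = None, 0
    for name, macs in venue_macs.items():
        overlap = len(observed & {mac.lower() for mac in macs})
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name
```

Addresses are lower-cased before comparison so that differences in how a device reports hexadecimal digits do not prevent a match.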

[0085] Once the venue is identified at operation 1203, if a blueprint of the building or map of the geographical area is not already stored on a mobile device used to collect the measurements at operation 1201, the blueprint or map of any type may be downloaded to the device at operation 1205. Alternatively, a blueprint or map of the area may later be downloaded to overlay the locational data on, for example, a blueprint of a building or a map of, for example, "City Center." The blueprint may then be loaded to the screen and viewed on the mobile device.

[0086] At operation 1207, the location is computed from the measurements collected at operation 1201 and estimated within the blueprint or map using the trained Random Forest for the location of interest. Since localization may be considered as a classification issue at operation 1207, the estimated location may be one of the cluster centers used as classification labels in the mapping process. This operation may be executed on, for example, the server identified above. For execution, the collected measurements from operation 1201 may be uploaded to the server. Alternatively, the location may be computed on the mobile device if the device already contains the trained Random Forest.

[0087] After the location is computed at operation 1207, the localization flowchart 1200 may continue with an optional primary location refinement, at optional operation 1209. Alternatively, after the location is computed, the location information may be updated at operation 1213 by following path 1220. A person of ordinary skill in the art will understand, for a given situation and level of required accuracy, whether or not to include the optional operations based upon reading and understanding the material provided herein.

[0088] If the user makes a decision to invoke the primary location refinement, at optional operation 1209, votes collected from each tree in the Random Forest can be used in a regression step implemented in optional operation 1209 in order to improve the localization accuracy.

[0089] After the primary location refinement operation is completed, the localization flowchart 1200 may continue with an optional secondary location refinement, at optional operation 1211. Alternatively, after the primary location refinement of optional operation 1209 is completed, the location information may be updated at operation 1213 by following path 1230.

[0090] If the user makes a decision to invoke the optional secondary location refinement, at optional operation 1211, the location refinement estimation of optional operation 1209 may be improved by, for example, using a Bayes Filter. The Bayes filter considers temporal and spatial coherence to eliminate false instantaneous localizations and to further improve the accuracy. One possible implementation of the optional operation 1211 implements a Monte Carlo Localization (particle filter) as discussed above.

[0091] Once the optional secondary location refinement operation is completed, the localization flowchart 1200 progresses to operation 1213 where location information may be retrieved (e.g., wirelessly through the Internet) either automatically or manually, with the user providing inputs or responding to prompts for input (e.g., through a graphical user interface (GUI)). Information previously stored in, for example, the mobile device, may also be updated based on the computed or computed and refined versions of the localization data. The updated information may then be stored (e.g., to the mobile device or the server, identified above). The final localization estimation may then be displayed to the user on the screen of the mobile device. Visualization of this information can be in various forms including, but not limited to, a beacon showing the latest location of the user, the path or paths the user followed, textual information about the location (e.g., x- and y-coordinates of the location of the user), and topological location information (e.g., an identification of a building in which the user is located such as "you are: in store X, in room Y, and in aisle Z," etc.). The localization flowchart 1200 may then continue through path 1240, to collect additional measurements at operation 1201. Alternatively, the localization flowchart 1200 may continue to a display location-based information step at optional operation 1215.

[0092] At optional operation 1215, the updated information based on the localization estimation may be displayed on, for example, the screen of the mobile device, either manually, at the request of the user, or automatically. The location-based information may vary tremendously based, at least partially, on the venue in which the user is located and may include, for example, location-based advertisements, turn-by-turn directions along with a path to be followed to reach desired locations, descriptions of surrounding items in, for example, a museum or store, restaurants in a mall, departing-arriving flight information with nearby gates/restaurants/restrooms and so forth in an airport, a directory of the doctors and offices in a hospital, or schedule of the talks with corresponding rooms in a conference center. A skilled artisan can envision many other types of information available based on the description provided herein. After optional operation 1215, the localization flowchart may continue back through path 1250 to collect additional measurements at operation 1201.

[0093] A person of ordinary skill in the art, upon reading and understanding the information disclosed herein, will recognize if the optional steps need to be performed and, if so, when the optional steps may be implemented. The localization flowchart 1200 presents only one possible scenario to implement the method for collecting WiFi measurements, developing a WiFi map, and providing information to the user. In other embodiments, the operations may occur in a different order. For example, a skilled artisan will recognize that the venue may be identified prior to collecting measurements. The download of the blueprint for the venue may occur either prior or subsequent to the measurements being collected. Also, one or both of the optional steps of location refinement may occur prior or subsequent to the display of the location-based information based on, for example, the desired level of accuracy with which the location-based information is displayed. Therefore, the localization flowchart 1200 merely provides one embodiment in which the method may be practiced.

[0094] Various embodiments discussed herein may be combined, or elements selectively chosen to be adapted into a new embodiment. Thus, many more permutations are possible beyond those explicitly discussed.

[0095] Therefore, while various embodiments of the inventive subject matter are described with reference to assorted implementations and exploitations, it will be understood that these embodiments are illustrative only and that a scope of the inventive subject matter is not limited merely to those described embodiments. Moreover, the systems and methods described herein may be implemented with facilities consistent with any hardware system or hardware systems either defined herein or known independently in the art using techniques described herein. Many variations, modifications, additions, and improvements are therefore possible based upon an understanding of the concepts and techniques expressed herein.

[0096] For example, the novel hybrid algorithm that has been disclosed, mixing classification and regression methods, is capable of localizing, with sub-meter accuracy, robots with a minimal sensor payload consisting of a WiFi card and odometry. As discussed herein, the end-to-end WiFi localizer compares favorably to previously published algorithms. In addition, the performance of the end-to-end WiFi localizer is competitive with the full LRF SLAM solution. The evaluation and comparison of the WiFi localizer against both published and new algorithms, along with experiments performed on different datasets, provide compelling evidence regarding the robustness of the approach detailed herein. Moreover, the algorithm is fast enough to ensure real-time mapping and localization. Additional embodiments can be readily envisioned based upon reading and understanding the material disclosed herein.

[0097] For example, the mapping algorithm may be modified to be performed incrementally, allowing the mapper robot to provide the localizer robots with partial maps or sections of the environment. Random Forests remain a viable option in that scenario, although they may benefit from some modifications to increase the number of trees when new data are available and to prune older trees that, over time, will not encompass enough information.
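By way of illustration only, the incremental modification described above can be sketched as ensemble bookkeeping: new trees are added whenever a new batch of geotagged measurements arrives, and the oldest trees are discarded once a cap is reached. The class name `IncrementalForest`, the parameters `trees_per_batch` and `max_trees`, the age-based pruning rule, and the one-split regression stump used in place of a full decision tree are all assumptions made for this sketch and are not taken from the disclosure.

```python
import random
from collections import deque
from statistics import mean


def fit_stump(X, y, rng):
    """Fit a one-split regression stump on a bootstrap sample.

    This is a deliberately tiny stand-in for a full decision tree.
    """
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
    f = rng.randrange(len(X[0]))                    # random feature (one AP's signal strength)
    thr = sorted(X[i][f] for i in idx)[n // 2]      # median threshold
    left = [y[i] for i in idx if X[i][f] <= thr]    # never empty: the median itself is <= thr
    right = [y[i] for i in idx if X[i][f] > thr]
    overall = mean(y[i] for i in idx)
    return (f, thr, mean(left), mean(right) if right else overall)


def predict_stump(stump, x):
    f, thr, low, high = stump
    return low if x[f] <= thr else high


class IncrementalForest:
    """Hypothetical ensemble that grows with new data and prunes by age."""

    def __init__(self, trees_per_batch=10, max_trees=30, seed=0):
        self.trees_per_batch = trees_per_batch
        self.trees = deque(maxlen=max_trees)        # oldest trees drop off automatically
        self.rng = random.Random(seed)

    def partial_fit(self, X, y):
        """Train additional trees on the newest batch only."""
        for _ in range(self.trees_per_batch):
            self.trees.append(fit_stump(X, y, self.rng))

    def predict(self, x):
        """Average the predictions of all trees currently retained."""
        return mean(predict_stump(t, x) for t in self.trees)
```

In this sketch each call to `partial_fit` corresponds to one partial map delivered by the mapper robot, and `y` holds a single coordinate; a second forest could be kept for the other coordinate. Pruning strictly by age is only one possible reading of "older trees"; a rule based on how well each tree still explains recent data would be an equally plausible alternative.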

Additionally, WiFi localization, thanks to its low sensor requirements, which are commonly met by low-cost robots, is well suited as a middle layer for merging heterogeneous maps together. Indeed, WiFi localization can solve, for example, the difficult problem of merging a Cartesian grid map with a topological or visual map. Furthermore, if needed, the algorithms presented can be made faster by parallelizing the training process.
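As a non-limiting sketch of the parallelization mentioned above, the trees of a forest can be trained independently because each tree depends only on the training data and its own random seed. The function names `train_tree` and `train_forest`, the per-tree seeding scheme, and the median-split stump used as a stand-in for a full tree are assumptions made for illustration. The example uses Python's standard `concurrent.futures` executors.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def train_tree(job):
    """Fit one stand-in 'tree' (a median-split stump) from its own seed."""
    X, y, seed = job
    rng = random.Random(seed)                        # per-tree seed keeps results reproducible
    idx = [rng.randrange(len(X)) for _ in X]         # bootstrap sample
    f = rng.randrange(len(X[0]))                     # random feature
    thr = sorted(X[i][f] for i in idx)[len(idx) // 2]
    left = [y[i] for i in idx if X[i][f] <= thr]
    right = [y[i] for i in idx if X[i][f] > thr] or left
    return (f, thr, mean(left), mean(right))


def train_forest(X, y, n_trees, executor=None):
    """Train n_trees independent trees, optionally through an executor."""
    jobs = [(X, y, seed) for seed in range(n_trees)]
    if executor is None:
        return [train_tree(job) for job in jobs]     # sequential baseline
    return list(executor.map(train_tree, jobs))      # map() preserves job order


# Toy data: two access points, one coordinate per sample.
X = [[-40.0, -70.0], [-45.0, -65.0], [-60.0, -50.0], [-70.0, -42.0]]
y = [0.0, 1.0, 2.0, 3.0]

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_trees = train_forest(X, y, 8, executor=pool)
sequential_trees = train_forest(X, y, 8)
```

Because every tree is seeded separately, the parallel and sequential runs produce the same forest. A thread pool is used here only so that the sketch runs anywhere; for CPU-bound training in CPython, passing a `concurrent.futures.ProcessPoolExecutor` instead would be the more likely way to obtain an actual speed-up.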

Modules, Components, and Logic

[0098] Additionally, certain embodiments described herein may be implemented as logic or a number of modules, components, or mechanisms. A module, logic, component, or mechanism (collectively referred to as a "module") may be a tangible unit capable of performing certain operations and is configured or arranged in a certain manner. In certain exemplary embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more components of a computer system (e.g., a processor or one or more processors) may be configured by software (e.g., an application or application portion) or firmware (note that software and firmware can generally be used interchangeably herein as is known by a skilled artisan) as a module that operates to perform certain operations described herein.

[0099] In various embodiments, a module may be implemented mechanically or electronically. For example, a module may comprise dedicated circuitry or logic that is permanently configured (e.g., within a special-purpose processor) to perform certain operations. A module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. It will be appreciated that a decision to implement a module mechanically, in the dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

[00100] Accordingly, the term module should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which modules or components are temporarily configured (e.g., programmed), each of the modules or components need not be configured or instantiated at any one instance in time. For example, where the modules or components comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different modules at different times. Software may accordingly configure the processor to constitute a particular module at one instance of time and to constitute a different module at a different instance of time.

[00101] Modules can provide information to, and receive information from, other modules. Accordingly, the described modules may be regarded as being communicatively coupled. Where multiples of such modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the modules. In embodiments in which multiple modules are configured or instantiated at different times, communications between such modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple modules have access. For example, one module may perform an operation, and store the output of that operation in a memory device to which it is communicatively coupled. A further module may then, at a later time, access the memory device to retrieve and process the stored output. Modules may also initiate communications with input or output devices and can operate on a resource (e.g., a collection of information).
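Purely as an illustrative sketch of the store-and-retrieve communication described above, one module may write its output to a shared memory structure and a second module may read and process that output at a later time. The names `MemoryStore`, `mapper_module`, `localizer_module`, and the key `"mean_rssi"` are hypothetical and chosen only for this example.

```python
class MemoryStore:
    """Hypothetical memory structure to which both modules have access."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]


def mapper_module(store, scans):
    """First module: average the signal strength per access point and store it."""
    readings = {}
    for scan in scans:
        for ap, rssi in scan.items():
            readings.setdefault(ap, []).append(rssi)
    store.put("mean_rssi", {ap: sum(v) / len(v) for ap, v in readings.items()})


def localizer_module(store):
    """Second module: runs later, retrieves the stored output, and processes it."""
    mean_rssi = store.get("mean_rssi")
    return max(mean_rssi, key=mean_rssi.get)   # access point with the strongest mean signal


store = MemoryStore()
mapper_module(store, [{"ap1": -40, "ap2": -70}, {"ap1": -50, "ap2": -60}])
strongest = localizer_module(store)
```

The two functions never call each other; they are coupled only through the store, which is the arrangement contemplated when the modules are instantiated at different times.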

Exemplary Machine Architecture and Machine-Readable Storage Medium

[00102] With reference to FIG. 13, an exemplary embodiment extends to a machine in the exemplary form of a computer system 1300 within which instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. In alternative exemplary embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, a switch or bridge, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

[00103] As shown in this example, the computer system 1300 includes a processor 1301 (e.g., hardware-based processor such as a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 1303 and a static memory 1305, which communicate with each other via a bus 1307. The computer system 1300 may further include a video display unit 1309 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 1300 also includes an alphanumeric input device 1311 (e.g., a keyboard), a user interface (UI) navigational or cursor control device 1313 (e.g., a mouse), a disk drive unit 1315, a signal generation device 1317 (e.g., a speaker), and a network interface device 1319.

Machine-Readable Medium

[00104] The disk drive unit 1315 includes a tangible machine-readable medium 1321 on which is stored one or more sets of instructions, operations, or data structures (e.g., software 1323) embodying or used by any one or more of the methodologies or functions described herein. The tangible machine-readable medium is non-transitory in that it does not embody a signal. However, labeling the tangible medium as "non-transitory" should not be construed to mean that the medium is incapable of movement - the medium should be considered as being transportable from one physical location to another. Additionally, since the medium is tangible, the medium may be considered to be a machine-readable device.

[00105] The software 1323 may also reside, completely or at least partially, within the main memory 1303 or within the processor 1301 during execution thereof by the computer system 1300; the main memory 1303 and the processor 1301 also constituting machine-readable media.

[00106] While the tangible machine-readable medium 1321 is shown in an exemplary embodiment to be a single medium, the term "tangible machine-readable medium" may include a single medium or device, or multiple media or devices (e.g., a centralized or distributed database, or associated caches and servers) that store the one or more instructions. The term "tangible machine-readable medium" shall also be taken to include any tangible medium or device that is capable of storing, encoding, or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the operations or methodologies of the present invention, or that is capable of storing, encoding, or carrying data structures used by or associated with such instructions. The term tangible machine-readable medium or devices shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of tangible machine-readable media or devices include non-volatile memory, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices); magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

Transmission Medium

[00107] The software 1323 may further be transmitted or received over a communications network 1325 using a transmission medium via the network interface device 1319 utilizing any one of a number of well-known transfer protocols (e.g., HTTP). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, mobile telephone networks, Plain Old Telephone Service (POTS) networks, and wireless data networks (e.g., WiFi and WiMax networks). The term "transmission medium" shall be taken to include any intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software.

[00108] Although an overview of the inventive subject matter has been described with reference to specific exemplary embodiments, various modifications and changes may be made to these embodiments without departing from the broader scope of the present invention. Such embodiments of the inventive subject matter may be referred to herein, individually or collectively, by the term "invention" merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is, in fact, disclosed.

[00109] The embodiments illustrated herein are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. The Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

[00110] Moreover, plural instances may be provided for resources, operations, or structures described herein as a single instance. Additionally, boundaries between various resources, operations, modules, engines, and data stores are somewhat arbitrary, and particular operations are illustrated in a context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within a scope of various embodiments of the present invention. In general, structures and functionality presented as separate resources in the exemplary configurations may be implemented as a combined structure or resource. Similarly, structures and functionality presented as a single resource may be implemented as separate resources.

[00111] These and other variations, modifications, additions, and improvements fall within a scope of the inventive subject matter as represented by the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

What is claimed is:

1. A method of determining localization in an unknown environment, the method comprising:

collecting a plurality of WiFi signal strength measurements from a plurality of access points at various locations of a robot within the unknown environment;

collecting data from motion sensors within the robot to determine a change in a position of the robot within the unknown environment as a function of time;

applying a regression algorithm to the plurality of WiFi signal strength measurements and the motion sensor data to determine a plurality of locations of the robot with reference to the plurality of WiFi signal strength measurements; and

developing a WiFi signature map of the unknown environment from an output of the regression algorithm.

2. The method of claim 1, wherein data regarding locations of the plurality of access points are unavailable prior to the developing of the WiFi signature map.

3. The method of claim 1, further comprising performing a Monte Carlo localization analysis in addition to the regression algorithm to determine the changes in the position of the robot as a function of time.

4. The method of claim 1, wherein the WiFi signature map of the unknown environment is developed substantially in real time as the robot moves within the unknown environment.

5. The method of claim 1, wherein the plurality of WiFi signal strength measurements is collected without connecting to any of the plurality of access points.

6. The method of claim 1, further comprising correlating a spatial coordinate determined from the collected motion sensor data from the robot with the plurality of WiFi signal strength measurements collected from the plurality of access points.

7. The method of claim 6, further comprising creating a signal map from the plurality of WiFi signal strength measurements.

8. The method of claim 1, further comprising:

determining an observation vector for each of the plurality of access points; and

correlating the observation vector with each of the determined changes in position of the robot as a function of time.

9. The method of claim 8, further comprising dynamically increasing the number of observation vectors as additional access points are sensed by the robot moving within the unknown environment.

10. The method of claim 1, further comprising improving an accuracy of the determined plurality of locations of the robot by collecting votes from a plurality of decision trees and applying the collected votes to a probabilistic model.

11. The method of claim 10, further comprising generating Cartesian coordinates for each of the plurality of locations of the robot from the probabilistic model.

12. The method of claim 1, further comprising maintaining the robot in a stationary position for each of a plurality of positions to collect a predetermined number of WiFi scans thereby adding additional WiFi signal strength measurements to the plurality of WiFi signal strength measurements to increase an accuracy of the developed WiFi signature map of the unknown environment.

13. The method of claim 1, further comprising:

uploading the plurality of WiFi signal strength measurements and the motion sensor data to a remote computer;

applying the regression algorithm at the remote computer; and

downloading a result of the regression algorithm to the robot to develop the WiFi signature map of the unknown environment.

14. The method of claim 1, further comprising applying a Bayes filter to the motion sensor data to reduce instances of false instantaneous localization data.

15. A tangible computer-readable medium having no transitory signals and storing instructions that, when executed by one or more processors, cause the one or more processors to perform operations, the operations comprising:

applying a regression algorithm to the plurality of WiFi signal strength

16. The tangible computer-readable medium of claim 15, wherein the WiFi signature map is developed for an indoor environment without receiving global positioning system (GPS) information regarding the plurality of locations of the robot.

17. The tangible computer-readable medium of claim 15, wherein the plurality of WiFi signal strength measurements is collected while the robot is substantially continuously in motion.

18. The tangible computer-readable medium of claim 15, further comprising:

uploading the plurality of WiFi signal strength measurements and the motion sensor data to a remote computer;

applying the regression algorithm at the remote computer;

developing the WiFi signature map of the unknown environment at the remote computer; and

downloading the WiFi signature map of the unknown environment to the robot.

19. The tangible computer-readable medium of claim 15, further comprising applying the regression algorithm and developing the WiFi signature map within the robot.

20. The tangible computer-readable medium of claim 15, wherein the WiFi signature map is developed for an outdoor environment.

21. The tangible computer-readable medium of claim 20, further comprising applying the WiFi signature map of the outdoor environment to improve a locational accuracy of collected global positioning system (GPS) readings.

22. The tangible computer-readable medium of claim 15, further comprising displaying the WiFi signature map onto an electronic device of an end-user and supplying turn-by-turn directions to the end-user to reach one or more desired locations within the unknown environment.

23. A method of determining localization in an unknown environment, the method comprising:

collecting a plurality of radio frequency signal strength measurements from a plurality of access points at various locations from each of a plurality of robots within the unknown environment;

collecting motion sensor data from motion sensors within each of the plurality of robots to determine a change in a position of each robot as a function of time within the unknown environment; and

applying a regression algorithm to the plurality of radio frequency signal strength measurements and the motion sensor data to determine locations of each of the plurality of robots with reference to each of the plurality of radio frequency signal strength measurements received by a respective one of the robots.

24. The method of claim 23, further comprising developing a signature map of the unknown environment from an output of the regression algorithm.

25. The method of claim 23, further comprising equipping each of the plurality of robots with a sensor payload including a WiFi card and an odometry sensor.

26. The method of claim 25, wherein each of the plurality of robots communicates with others of the plurality of robots through respective ones of the WiFi cards.