Monday, March 15, 2010

Optimization algorithms using OpenStreetMap data

Recently I've been making some edits to the map data served up by the good folks at OpenStreetMap.org

OSM is an open, free map of the world built by thousands of volunteers worldwide (over 220,000 as of this post). When you visit the site, you're greeted by a user-friendly mapping experience that covers the basic features of any modern online map browser. But what makes OSM really special for us spatial data analysis researchers and practitioners is

Tuesday, May 5, 2009

Google shows trends in Ops Research activity

How much can you tell by the volume of Google searching on a particular topic? Apparently enough to predict a flu epidemic faster than the CDC. So I thought it would be interesting to search for a few terms in Google Trends to see if there were any interesting results. The volume of searches for "operations research" has been steadily declining during the period from 2004 to the present.

Interesting...

Saturday, December 20, 2008

GIS at massive scales... who will get us there? Understanding high performance computing for GIS.

As a GIS practitioner for over ten years, I've been subject to all of the same limitations and headaches as my peers. GIS has some powerful capabilities for analysis and research, but boy can it be slow. Many GIS engines have operations that work only in active memory, others are limited to working on single files or databases, and we've all experienced the blasted slowness of it all. So how can we get over this hurdle? We've taken major strides in the digital mapping field in the last several years: tiled map interfaces are commonplace online now, the user experience is far better than in the first generation of mapping services, and analytically we're doing more every day. ESRI's toolbox capability in the 9.x series allows them and others to develop special purpose toolsets that perform almost any spatial operation known. But how can we do more? Every GIS practitioner I know wishes that we could perform all these calculations on more and more data at faster and faster speeds. As GIS capability grows, so does the demand for higher resolution data. As a result, we've been at a standstill in terms of processing speed for 5 years or so.

Wednesday, August 20, 2008

Using the open source R project for spatial statistics

This info was originally posted here, but since most people would never find that link, I decided to repost it here where there's a bit more traffic. The open source R language and statistical package has many useful functions for investigating spatial data. I recommend it, and I use R extensively for prototyping new algorithms for spatial statistics, forecasting, and machine learning. More after the jump...

Thursday, August 14, 2008

Machine Learning Applied to Remote Sensing: Part 1 - Imagery


Remote sensing is the method of collecting and analyzing data using mechanical or electronic sensors. The industry commonly considers imagery to be the main remote sensing medium. A new age of understanding the world was ushered in when French photographer Gaspard-Félix Tournachon (a.k.a. Nadar) first strapped a camera to the basket of a tethered balloon in 1858, and when James Wallace Black's glass negative camera took panorama shots of Boston from a balloon in 1860. They were pioneering photographers inventing a new way to collect data about the world around us in the form of aerial imagery. Since that time, new methods of flight (the Wright brothers did some of the first photography from an airplane) and improvements in technology have led to greater accuracy, resolution, and clarity, and to a useful resource for everything from natural resource planning to fighting wars (from the US Civil War to the Middle East conflicts of today). Today's remote sensing data includes not only aerial imagery but also satellite imagery, RADAR, LIDAR, SONAR, and data from many other sensors. In fact, the evolution of operations research begins with sonar sensor operations. Questions of how to optimally deploy these sensors and interpret their data yielded the discipline of using applied mathematics to determine the operational parameters of these systems, and OR was born.

That legacy lives on today. Globally there are more than 500TB (my swag estimate from company websites) of imagery collected each month by commercial imagery providers. Who knows how much the world's governments are collecting; it almost certainly dwarfs that number. Having people visually inspect all that data is a hopeless task. It's error prone, slow, and inefficient. That's where OR, and specifically techniques in machine learning and computer vision, come in. Some basic operations that add value when performed on imagery are edge detection, class segmentation, terrain models, watershed analysis, line of sight, slope, aspect, and impervious surface models.
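To make one of those operations concrete, here is a minimal sketch of edge detection using the standard Sobel operator on a tiny grayscale grid. It is only an illustration under my own assumptions: the function name `sobel_magnitude` and the toy image are made up for this example, and real imagery work would use an image processing library rather than nested Python lists.

```python
import math

# Sobel kernels: GX responds to left-right intensity change,
# GY responds to top-bottom intensity change.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude for interior pixels; borders stay 0.

    `image` is a list of equal-length rows of numeric pixel values
    (a hypothetical stand-in for a single-band raster).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    v = image[r + dr - 1][c + dc - 1]
                    gx += GX[dr][dc] * v
                    gy += GY[dr][dc] * v
            out[r][c] = math.hypot(gx, gy)
    return out


# Toy 5x5 "image": dark on the left, bright on the right.
toy = [[0, 0, 0, 9, 9] for _ in range(5)]
edges = sobel_magnitude(toy)
```

In the toy image the large values in `edges` line up with the columns where dark meets bright, which is the kind of signal a later classification or feature extraction step could build on.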

Wednesday, July 30, 2008

Preparing for the ESRI Intl User Conference

Next week is the big ESRI User Conference event. I've been to this show several times over the last 10 years and it keeps getting bigger and broader. If you're going, you'll have the opportunity to talk to researchers and practitioners in your field from all over the world, but you'll have to find them first. There are over 10,000 people at the UC and it can get a little overwhelming. Here are some tips to keep your sanity and focus.
  1. Get there early in the morning each day. Most people trickle in throughout the day, due to jetlag, sunshine-itis (the UC is in beautiful San Diego), or trying to catch up on work from their hotel room. Don't do it! Get to the conference center early, get registered, and start wandering.
  2. Plan out your day ahead of time. Most of the sessions are handily arranged in tracks, so if you're a hydrologist, you won't have to go far to see talks in your field. But when your primary focus is spatial research (and yours is, right?), you may have to hustle to see all the good stuff. There will be good research talks in all domains, and they may be spread out pretty far across the conference center. Allow yourself some walking time, and pick out which talks you want to go to.
  3. Don't be afraid to get up and leave a session. If I'm not speaking at a session, I like to sit near the side or back so I can bail out quickly and quietly if the talk isn't what I expected. Naturally I understand if someone gets up and leaves one of my talks. There's a lot going on, and you can only see a small fraction of it. Make your trip worthwhile and see the sessions, demos, and vendors that are of the greatest interest to you.
  4. Mingle at night! Don't sit in your hotel. Nearly everyone there is from out of town, and San Diego is a great place to explore on foot. Meet up with some people and hit the streets in the Gaslamp district or down by the waterfront south of the conference center/hotel.
  5. Take notes. Paper and pens are available at every session. You'll never remember close to everything you hear, so take notes, and review them when you get back. Follow up with an email or phone call to the speaker if you have questions. Speakers love to get questions, so feed their egos and learn something in the process.
  6. Talk to the speakers after the session. If something strikes you as particularly interesting, don't wait until after the conference to contact a speaker, do it right then. Strike up a conversation and maybe you'll be able to follow up over lunch or at a break.
I'm looking forward to this year's show. Although there is a WIDE array of topics, and not all will interest you, there are some real gems each year and it's worth it to mine those out. And yes, I'm giving a talk too, so if you read this site, say 'Hi'.

Saturday, July 12, 2008

IEEE VAST Symposium Challenge


Last night my team and I finished submitting our entries for the IEEE Visual Analytics Science and Technology Grand Challenge. This year's contest was made up of four mini-challenges and a grand challenge that ties everything from the other challenges together. The data was well thought out, and the problem overall had a good mix of easy problems to get started and more challenging ones to strive for. I'm sure we didn't find all the answers possible in the data, and we went down many analytical paths that didn't pan out, but that's the way it works in practice, so it was a good experience. Two of the mini-challenges had a spatial-OR approach. The first was an analysis of the illegal immigration patterns of a fictional group called Paraiso. Our team used spatio-temporal variograms to examine the strength of the migration patterns, and animations of the Coast Guard interdiction to examine their success rate.
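For readers who haven't met a variogram before, here is a minimal sketch of the simpler, purely spatial empirical semivariogram. It is not our contest code, and it leaves out the time dimension entirely; the function name `empirical_variogram`, the binning scheme, and the toy data are assumptions made for illustration. For each distance bin h it computes gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)) over the N(h) point pairs whose separation falls in that bin.

```python
import math
from collections import defaultdict


def empirical_variogram(points, values, bin_width):
    """Return {bin_index: semivariance} for all point pairs.

    points: list of (x, y) tuples; values: one observation per point.
    A pair at distance d falls in bin int(d // bin_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # requires Python 3.8+
            b = int(d // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}


# Toy data: four points on a line whose values drift apart with distance.
pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
vals = [1, 2, 4, 7]
gamma = empirical_variogram(pts, vals, bin_width=1.0)
```

A semivariance that rises with distance, as in the toy data, indicates that nearby observations are more alike than distant ones. A spatio-temporal version would bin pairs by time lag as well as distance.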

The second spatial-OR flavored challenge was the fictional account of a bombing of a building. IEEE provided very high quality (unrealistically so) data from 'RFID' tags on the occupants of the building. By examining the patterns of movement, we were to identify potential suspects, witnesses, and casualties in the building.

We'll see how we did in August, and then meet with all the other submitters at the VAST Symposium in October.