I recently had a few days off and managed to sort out the growing collection of photographs accumulating on my hard drive. The collection is almost 150GB with 52000+ image and video files spanning 10 years. I have used a variety of photo management tools over the years including Canon software that came with the camera, FSpot, gThumb, iPhoto and Digikam (the tool of choice). The resulting mess of nested folders and sub-folders demanded some TLC. Thankfully I had a couple of backups on different disks as well as two live working copies so I was safe in case I messed up.
Enter exiftool, a command-line tool for managing all aspects of your photo metadata.
I copied my collection to a scratch processing space year by year and processed them in chunks using a single line of exiftool wizardry:
This command recurses (-r) through the input directory finding all supported image and video files. It moves the files to the output folder, creating a YEAR_MONTH sub-folder (%Y_%m) using the original creation date of the file to be moved. The creation date and time (%Y-%m-%d_%H-%M-%S) is prefixed to the original filename (%%f.%%e). For each year of photos I end up with 12 folders (2005_01, 2005_02, etc.) containing all the nicely sorted photos.
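To make the naming pattern concrete, here is a minimal Python sketch of the destination path that pattern would produce for a single file. It only mimics the date formatting and does not read metadata or call exiftool. The function name `target_path`, the `output` folder, the sample file name, and the underscore joining the timestamp to the original name are all assumptions for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def target_path(output_dir, created, original_name):
    """Build <output_dir>/YYYY_MM/YYYY-MM-DD_HH-MM-SS_<original name>."""
    folder = created.strftime("%Y_%m")              # e.g. 2005_01
    prefix = created.strftime("%Y-%m-%d_%H-%M-%S")  # e.g. 2005-01-15_14-30-05
    # Assumption: timestamp and original name are joined with an underscore.
    return PurePosixPath(output_dir) / folder / f"{prefix}_{original_name}"


# Hypothetical file created on 15 January 2005 at 14:30:05.
example = target_path("output", datetime(2005, 1, 15, 14, 30, 5), "IMG_0042.JPG")
print(example)
```

Because the timestamp leads the file name, a plain alphabetical listing of each monthly folder is also a chronological one.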
Exiftool also reports errors and any files it is unable to process. These files remain in the input folders after processing, which makes it simple to check through them manually. I also had some success with the remnants using the Last Modified Date.
I recently had to update a live database with updated tables from a staging database and then continue to update on a daily basis. As it is a regular update and the source and destination tables won’t change I generated a text file with a list of layers to process and tables to write. Like this:
The first column is the list of layers in the staging database to process. This is the %G variable in the shell script. The second column is the new table to write, the %H variable.
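As an illustration of the two-column list, here is a minimal Python sketch that reads layer and table pairs from text. The exact layout of the original file is not shown here, so the whitespace-separated columns, the function name `read_layer_list`, and the sample layer and table names are all assumptions.

```python
def read_layer_list(text):
    """Return (source_layer, target_table) pairs from two-column text."""
    pairs = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(parts)}"
            )
        pairs.append((parts[0], parts[1]))
    return pairs


# Hypothetical list: staging layer first, live table second.
sample = """
staging_roads live_roads
staging_buildings live_buildings
"""

for layer, table in read_layer_list(sample):
    print(f"{layer} -> {table}")
```

Rejecting malformed lines up front means a typo in the list stops the run before anything is written to the live database.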
The initial load read in the layers from the staging database and created them in the live database. I set the progress flag to check it was doing something (this can be deleted), set the geometry column and output schema.
I had a request for some “spider diagrams” showing the connections between service centres and their customers and was given some sample data of about 140000 records.
The data contained a customer ID and customer coordinates and a service centre ID. Using another table of service centres I was able to add and update for each record the service centre coordinates (eastings and northings on the British National Grid EPSG:27700).
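Once every record carries both pairs of coordinates, each "spider leg" is simply a two-point line from customer to service centre. The following is a minimal Python sketch of that idea, producing WKT text rather than working inside the database. The field names, IDs and coordinates are hypothetical sample values, not the original data.

```python
def spider_line_wkt(customer_xy, centre_xy):
    """Return a WKT LINESTRING from a customer to its service centre."""
    (cx, cy), (sx, sy) = customer_xy, centre_xy
    return f"LINESTRING({cx} {cy}, {sx} {sy})"


# Hypothetical sample data: eastings and northings in EPSG:27700.
centres = {"SC1": (530000, 180000)}
customers = [
    {"customer_id": "C1", "centre_id": "SC1", "x": 531250, "y": 181400},
    {"customer_id": "C2", "centre_id": "SC1", "x": 528900, "y": 179650},
]

# Look up each customer's centre and build one line per record.
lines = [
    spider_line_wkt((c["x"], c["y"]), centres[c["centre_id"]])
    for c in customers
]
for wkt in lines:
    print(wkt)
```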
I have been using pgRouting for some accessibility analysis to various facilities on the network and experimenting with different ways of making the process faster.
My initial network had 28000 edges, and solving a catchment area problem from one location to all other nodes on the network took 40 minutes on a 2.93GHz quad-core processor with 4GB RAM (Windows 7, PostgreSQL 9.2, PostGIS 2.0.3 and pgRouting 1.0.7). I put the query into a looping function that processed the facilities in order, but with more than 4 facilities the machine ran out of memory because the complete solution is held in RAM until the loop finishes.
As a first step, I reduced the number of edges in the network to 23000 and the number of nodes to 17000 by removing pedestrian walkways, alleys, and private and restricted roads. The query is now solved in about 12-14 minutes using about 200MB RAM per facility.
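The pruning step can be sketched in a few lines of Python: drop the edges whose road class is unwanted, then count the nodes that are still referenced. This is only an illustration of the idea on a toy edge list. The `road_class` attribute, its values, and the function names are assumptions, not the schema of the actual network table.

```python
# Assumed class labels for the road types that were removed.
EXCLUDED = {"pedestrian", "alley", "private", "restricted"}


def prune_edges(edges, excluded=EXCLUDED):
    """Keep only edges whose road class is not in the excluded set."""
    return [e for e in edges if e["road_class"] not in excluded]


def node_count(edges):
    """Count the distinct nodes still referenced by the given edges."""
    return len({n for e in edges for n in (e["source"], e["target"])})


# Hypothetical toy network with four edges.
edges = [
    {"id": 1, "source": 1, "target": 2, "road_class": "primary"},
    {"id": 2, "source": 2, "target": 3, "road_class": "alley"},
    {"id": 3, "source": 2, "target": 4, "road_class": "residential"},
    {"id": 4, "source": 4, "target": 5, "road_class": "private"},
]

kept = prune_edges(edges)
print(len(kept), node_count(kept))
```

Removing edges can leave parts of a network disconnected, so it is worth checking that every facility still sits on the pruned network before routing.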
I am in the process of rendering a series of map tiles based on the OS OpenData products using the gdal2tiles.py script (and an updated version that uses all cores on the machine to speed things up). The different raster products are rendered at different scales and then displayed using LeafletJS and OpenLayers applications as simple demonstrations.
The following command generates the tiles for the zoom levels I need:
This is a sample page showing a Leaflet map that uses a GeoServer WMS with data in British National Grid (EPSG:27700). It uses the Proj4Leaflet plugin to set the display projection to EPSG:27700, as this is not one of the projections supported by default.
I recently had to update the workspace that I use to keep track of jobs. Each job has a numbered folder (for example “300”) containing three sub-folders (“input”, “output” and “working”). To create a new set of folders I use a batch file:
FOR /L %%G IN (300,1,599) DO (
    echo Making job folder %%G...
    mkdir %%G
    echo Making input folder %%G...
    mkdir %%G\input
    echo Making output folder %%G...
    mkdir %%G\output
    echo Making working folder %%G...
    mkdir %%G\working
)