I had the requirement of extracting hospitals from OpenStreetMap into Postgresql so that some data analysts could run some scripts off it.

I fired up an AWS EC2 server (t3.large) running Ubuntu and installed the awscli package and configured it using my AWS Key and Secret Key.

sudo apt-get install awscli
aws configure

Then I downloaded the PBF file from AWS S3 (you don't incur a charge if downloaded from S3 if you are using EC2)

aws s3 cp s3://osm-pds/2018/planet-180903.osm.pbf .

The file size was about 42GB and took 20 minutes to download. Next, I used Imposm3 instead of Osm2pgsql to read the PBF file and import into postgresql server.

imposm import -config config.json -appendcache -read planet-180903.osm.pbf -write

The config.json specified the database and mapping parameters.

{
     "cachedir": "/home/ubuntu/cache",
     "connection": "postgis://osm:osm@localhost:5432/osm",
     "mapping": "mapping.json",
     "srid": 4326
}

Since I was only interested in the location of hospitals and didn't require other tags or admin boundaries, relations etc, this is how my mapping.json looked.

{
     "tags": {"load_all": true},
     "tables": {
         "amenities": {
             "columns": [
                 {
                     "type": "id",
                     "name": "osm_id",
                     "key": null
                 },
                 {
                     "type": "geometry",
                     "name": "geometry",
                     "key": null
                 },
                 {
                     "type": "string",
                     "name": "name",
                     "key": "name"
                 },
                 {
                     "type": "mapping_value",
                     "name": "type",
                     "key": null
                 },
                 {
                     "type": "hstore_tags",
                     "name": "tags",
                     "key": "tags"
                 }
             ],
             "type": "polygon",
             "mapping": {
                 "amenity": [
                     "hospital"
                 ]
             }
         }
     }
 }

The imposm import command took nearly 19 hours to import the data into postgresql. You can push the data into the public schema using the -deploytoproduction flag or just use the import schema if you are comfortable with it.

Additionally, I exported the data from the osm_amenities table into a json file (a json line for every row in the table) using the following PSQL command:

COPY (SELECT row_to_json(t)
      FROM (SELECT id, osm_id, name, type,
                   hstore_to_json(tags) as tags,
                   ST_AsGeoJSON(geometry)::json as geometry
            FROM import.osm_amenities) t
     )
TO '/tmp/output.json';

My Muttrc

Thu 13 September 2018 by Thejaswi Puthraya

I have been using Mutt, a text-based email client for ages for some self-hosted email that I use internally. Someone recently asked me to share my .muttrc and so here goes:

set editor = "emacs -nw"

set mbox_type=Maildir
set mbox="~/Mail/Inbox"
set spoolfile="~/Mail/Inbox"
set folder="~/Mail/"
set …
read more

Port Forwarding with HAProxy

Sat 11 August 2018 by Thejaswi Puthraya

I recently acquired a beefy bare-metal server and wanted to run a bunch of services within VMs based on KVM managed by libvirt. Only ports 80 and 443 for these VMs would be exposed and the rest of the ports (say SSH) visible only from the internal network.

Initially, I …

read more

Postfix SMTP Delegation

Sat 23 December 2017 by Thejaswi Puthraya

Email still continues to be a popular communication and collaboration tool especially within enterprises. And Postfix (configurable SMTP server) has withstood the test of time and remains popular.

Programmable SMTP servers like Haraka, Lamson make building applications on top of email easy but I am not comfortable exposing them as …

read more

Hosting multiple python apps different url mount points

Fri 22 December 2017 by Thejaswi Puthraya

If you host multiple (virtual hosting) python web apps (django, flask etc) behind nginx on a single server, you have two options:

  1. Using subdomains
  2. On different URL mount points

Using subdomains, you will have to update your DNS records for every app that you deploy. While the deployment is cleaner …

read more

Self-signed TLS Certificate in Linux

Sun 21 August 2016 by Thejaswi Puthraya

Today, I had the necessity to embed an iframe on an HTTPS web page and most browsers now won't allow you to embed an HTTP resource on an HTTPS page raising the mixed content warning. Since the code was quite experimental, I didn't want to deploy it out of localhost …

read more

Elasticsearch autocomplete on related keywords

Thu 21 April 2016 by Thejaswi Puthraya

Recently, I had to work on a client request to add autocomplete to their site's search based not just on a specific list but on related keywords.

For example, suppose you had a list of TV series like Simpsons, Futurama, Tom and Jerry etc. The autosearch had to suggest the …

read more

Simple Command Line Dropbox Uploader in Python

Sun 10 April 2016 by Thejaswi Puthraya

I don't use Dropbox frequently. In fact, I would prefer not to use it but work requires me to use it from time to time. My only use case was to upload files onto a shared folder. So for my usage, it makes no sense to run the desktop client …

read more

Nginx Basic Auth without htpasswd-tools

Mon 10 November 2014 by Thejaswi Puthraya

Sometimes you may want to hide your staging server or project under development from search engines and prying eyes but you don't want to take a lot of pain. HTTP Basic Auth is one solution (can be used as a proper authentication mechanism for web-apps as well) to allow only …

read more

Resize Qemu Image

Wed 30 October 2013 by Thejaswi Puthraya

The default cloud image on Ubuntu has a root partition size of 2GB. I wanted a way to resize the root without going through the hassle of setting up LVM.

To be able to resize the qemu images, we need the Guestfish package.

qemu-img create -f qcow2 -b base_disk.img …
read more