Migrate data into mongodb

Posted on January 4th, 2015

Whilst migrating some websites, I wondered how I would handle moving the existing data from mysql to mongodb. (My existing code never dealt with this situation, and would read / write data directly to mongodb.)

So a quick google gave me a few options.

* mongify – from http://mongify.com/
* mongoimport – comes with mongodb when installed (should be in your path after installing mongodb, in the bin folder e.g. /usr/local/mongodb/bin/mongoimport)
* custom script

Since I had only a very small import, installing mongify seemed like overkill, so I reserved this approach for later. Feel free to check it out if you have more serious migrations planned! Also, I wanted a solution that didn’t have to connect to both mongodb AND mysql. It sounds crazy, but imagine only having mysql on the source, and only having mongodb on the destination – though see my thoughts at the end.

I wanted something more like the mysql command line tools, where the mysql command can import a file exported by the mysqldump utility, and mongoimport sounded like a good start. It can import data from json, csv or tsv files, such as those created by the mongoexport utility. Trouble is, I wasn’t exporting from mongodb, and a mysqldump file would be no use here.

If I could just find some way to export my data as json, I could then try passing it to mongo. Luckily for me, I had a RESTful API sat in front of the MySQL database, so it was a trivial matter to export the whole data set as a json array using a simple HTTP GET request 😀

The data set I exported was a single table (a really simple example!) containing a list of wines, which I saved as “wines.json”. (If you wish to replicate this, set up the wine cellar example from https://github.com/ccoenraets/wine-cellar-php and use the Slim API to get the data.) Unfortunately, mongoimport didn’t seem to like this file for some reason.

That left me with the option of a custom import script, so I hacked up the following file. It is just a quick and dirty php script that reads each record (document in mongodb parlance) from the array in the wines.json file and inserts it into a collection in mongo.

[code language="php"]
<?php

// Decode the exported JSON array; passing true returns associative arrays,
// so nested objects are converted as well.
$wines = json_decode(file_get_contents('wines.json'), true);

if (!is_array($wines)) {
    die("Could not decode wines.json\n");
}

$mongo = new MongoGateway();

foreach ($wines as $wine) {
    $mongo->insertDocument($wine);
}

exit;

class MongoGateway {

    private $conn;
    private $db;

    public function __construct() {
        try {
            // Connect to MongoDB (MongoClient replaces the deprecated
            // Mongo class in the legacy PHP driver)
            $this->conn = new MongoClient('mongodb://localhost');

            // select the "test" database
            $this->db = $this->conn->test;
        }
        catch ( MongoConnectionException $e )
        {
            // if there was an error, we catch and display the problem here
            echo $e->getMessage();
            die('error');
        }
        catch ( MongoException $e )
        {
            echo $e->getMessage();
            die('error');
        }
    }

    public function __destruct() {
        // close connection to MongoDB
        $this->conn->close();
    }

    public function insertDocument(array $document) {
        try {
            // the "products" collection receives the imported documents
            $collection = $this->db->products;

            // insert the array; the driver adds an _id key to it
            $collection->insert( $document );
            echo 'Document inserted with ID: ' . $document['_id'] . "\n";
        }
        catch ( MongoException $e )
        {
            echo $e->getMessage();
            die('error');
        }
    }
}

[/code]

Thoughts:
Perhaps I could have achieved this by installing mongify twice, once to perform the export on the source (mysql) db, copying the resulting file over, and then using mongify again to import that file into the destination db.

I also discovered that using mongoimport/mongoexport risks losing some fidelity of the native BSON types used in mongo internally, which don’t all have representations in json (probably not a real risk for me, but you might want to look at mongodump/mongorestore or db.collection.clone if this sounds like it might affect you).

I’m sure there are plenty of ways to clean it up but I only ever expected to use it to migrate a couple of schemas so didn’t spend much time on it. I’m posting it here in case anyone can use it for inspiration (No warranties! Use at your own risk etc) as it served its purpose well enough for me.

It borrows heavily from the examples given at http://www.phpro.org/tutorials/Introduction-To-MongoDB-And-PHP-Tutorial.html

PHP-free websites, using varnish and golang

Posted on January 1st, 2015

Websites without php are not a new thing, but if you have been in the habit of coding sites using a LAMP stack for a while, then it can feel very strange moving out of the familiar comfort zone of Apache and PHP.

Recently I challenged myself to do exactly that – abolish my old php habits and rewrite all my sites using go. I was itching to rewrite them anyway, to make them more RESTful as I went, so I figured why not try and put them into go at the same time?

This guide will help you create a static website from your existing files, served by a go webserver on port 3000, which is made available to visitors through a varnish caching proxy server on port 80, which passes through requests to the backend port 3000. The go webserver can also be extended to provide more complex functionality later.

Preparation – install go lang

Turns out that golang doesn’t have a binary distribution for RHEL/Centos distros – and it doesn’t compile nicely (some tests don’t succeed) so you have to use make.bash instead of all.bash to get a ‘go’ binary. See this post if you want more details: http://dave.cheney.net/2013/06/18/how-to-install-go-1-1-on-centos-5

Ok, so to start with, how do you set up go to run a basic, static site – composed of nothing more than html, css, and images? Luckily I had such a site, so it was a simple matter of turning off apache and mysql (they weren’t being used for anything else on this server) and then creating my new structure.

If you have used plesk or any other control panel, you may find your website docroot in a folder location such as /var/www/domainname/httpdocs

So, I created a directory (as root) /goprojects/domainname/ and in here I can create my go program. I also created a /goprojects/domainname/public/ directory, and this is where all my html, css and images will reside.

Next I `export GOPATH=/goprojects/` and am almost done (I need to use the full path to the ‘go’ command, but later I will edit my bash profile so it is in my path or set up an alias.)

I am going to use go-martini and create a server.go file in /goprojects/domainname/ as per the readme at https://github.com/go-martini/martini (see step 1 of 2 below)

This is a fully fledged RESTful implementation in go, but we don’t need to specify any routes now – if an asset exists in the public/ folder we created, it will simply serve it. Simples. Yes, we could use a simpler http server class, but I’m planning on using this package for the more complex sites and I like a little consistency.

For more info on building RESTful sites in go, try going through the martini docs or read walkthroughs like http://thenewstack.io/make-a-restful-json-api-go/ – it’s beyond the scope of this quick intro.

Step 1 of 2

Create server.go as follows:

package main

import "github.com/go-martini/martini"

func main() {
  m := martini.Classic()
  m.Get("/", func() string {
    return "Hello world!"
  })
  m.Run()
}

Then install the Martini package (go 1.1 or greater is required):

go get github.com/go-martini/martini

Then run your server:

go run server.go

You will now have a Martini webserver running on localhost:3000.

Step 2 of 2

Install varnish using your favourite package manager, e.g. `yum install varnish`. Make sure it isn’t running yet, e.g. `service varnish status` or `service varnish stop`.

By default varnish listens on high ports (6081 for traffic, 6082 for admin) and reverse proxies to localhost port 80. This is ok for testing, but you may need to edit the DAEMON_OPTS in the /etc/sysconfig/varnish file now, replacing 6081 so that varnish listens on port 80 (where previously apache would have been). The admin port 6082 can be left or changed to suit your security requirements.

We then change the default.vcl file and tell it to use localhost port 3000 as the backend (our go webserver), save and restart.

And you’re done!