[Edit date=2009-01-25] Please note, I mention some silly unnecessary steps in the post ahead, when I could easily set tighturl’s autoincrement to a number higher than the highest lilurl, and add some lines to tighturl to look at a different table for older (lilurl) entries. It is totally not necessary to copy over non-leading-zero ids to the tighturl table, but if this is not done, those URLs don’t get the additional data tighturl provides like hit count (unless, of course, you patch tighturl further than an ‘if less than [tighturl_floor], jump to lilurl code, else use tighturl code’, which I personally don’t think is wise, but I’m very inexperienced). Thus, while it is indeed extra work to convert valid base 36 numbers into ints in a tighturl-style table, I believe the added benefits such as hit counts and last-hit are worth it. Worth is, however, as always, determined by the user. [/Edit]
Yesterday I was bored and wanted to help Evan move ur1.ca, a url shortening service aimed primarily at identi.ca users, to TightURL. This is apparently planned, and will happen “eventually”. I loaded up the database (after messing with it quite a bit, PHPMySQL doesn’t import TSV files and saving it as a CSV was stupid as several lines include commas. In the end, I had to do a series of steps:
- Extract the database
- Rename it to be a .tsv and open it in Excel
- Ctrl+F for commas and replace them with the text COMMATIME
- Ctrl+F for semicolons and replace them with escaped semicolons (\;)
- Save to a csv on the desktop
- Open the csv in TextEdit
- Ctrl+F for commas and replace them with semicolons
- Ctrl+F for COMMATIME and replace them with commas
- Import the CSV file (don’t use the LOAD DATA thing)
I noticed pretty quickly some oddities, like entries for 0, 00, 000, 0000, etc. It didn’t click right away that these were not base36 numbers (they’re all 0!) and were in fact just alphanumeric ids. For example, this post is http://ur1.ca/0y5s. That explained the VARCHAR(255) column (what the fuck). I ended up adding the other columns tighturl used, modifying existing ones to be proper, etc, thinking I could end up converting it pretty easily. I was quite wrong. An overhaul (and perhaps a “plugin” for tighturl, despite it’s lack of a plugin architecture) would be required.
What must be done to convert a lilurl site to a tighturl site:
- Add a second table for entries that have a leading 0. I believe all other entries are valid numbers. This could be a big table.
- Write some code for TightURL (and maybe get it patched in, the author seems relatively active, and he has an e-mail address) to take shorturl ids that start with a 0 and look in the second ‘compatibility’ table for these URLs created before moving to a sane system.
- Get something that casts a fix on the column. My initial thought was to create a second column, nid (numerical id), and slowly convert each id to the nid table as an INT. I’m told this is unnecessary by people whom I believe do not know the full extent of the problem.
Of course, this could also be overcome by setting up a subdomain for LilURL and disabling the ability to create new URLs, and redirecting links that have leading 0′s there.
Evan excuses LilURL as someone’s first project, but I’m really surprised someone could think just using as-of-yet-unused alphanumeric ids was a good idea even during their first project.
I hope to add more to this post (for those who are in fact interested in doing the conversion) soon.