August 2006


I have been working professionally as a PHP programmer for a long while. I recently started at a new company where we work entirely in Rails, so I will be recording my progress here.

Having worked previously in ASP (many moons ago) and then in PHP for six years, I first have to say that Rails is “different”.
I have basically avoided frameworks, as the ones in PHP try to turn PHP into something it is not. My general development was speedy, though maintenance was not always the best on “fast” sites. I have a base library of functions that I have found or written, which I know well and use to build up an application quickly. I think part of my general avoidance of frameworks was that I didn't want to use a different templating system. I have an extremely flexible one I made, and since then I have seen horrid templating systems at other companies; I think most people don't understand the basics of templating at all. If you are going to create a templating system, don't even give the templates themselves the .php extension. Use .tpl or .tplt or whatever, just to keep yourself from being seduced into running loops in them instead of reusing fragments.
Anyway, Rails provides a templating system and a nice set of classes, starts you off with excellent organization, and does a lot of the work for you.
I have actually found it easy to switch to Rails for the simple fact that it is NOT PHP; if it were, I would be too tempted to start changing things and hacking it up. I really am more of a backend programmer than anything, and I prefer JSON+JS for frontend work.

There are many, many things I really like in Rails so far. The AJAX support is astounding: in a few lines I can do what used to be a full day's work. I am in love with database migrations and the database model system, which in one fell swoop fix 75% of the issues you run into with PHP apps and data validation.

It is not all roses though:
One of my main gripes is that the Rails docs do not have a comment system like php.net … this is a total travesty and it needs to be fixed; most people get more from the comments than from the documentation. I myself have contributed several hundred comments to the PHP site over the years (which make up the majority of the php-ncurses docs, including the undocumented php-ncurses functions). Coming from PHP, it totally BLOWS not having this kind of resource. Sure, fxri is a nice way to find stuff, but it would be really handy not to have to search all over to find Rails docs with a little programmer commentary.

Rails keeps tight control over globals, and it is never easy for a programmer to know the “scope” of a variable or a function. Take “request” for instance: I had to dig around for a while to figure out that this is where the Rails equivalent of the PHP $GLOBALS lives, and then I found out that it was not actually “global”, because I could not call it within functions in a helper. It was also not entirely clear from the start what the @ and : signifiers meant (@ marks an instance variable, : marks a symbol), which was frustrating until I sort of figured it out.

The only other issue is not really Rails-specific: working with Rails you need to have so many files open that using my editor, gvim, is difficult at best. Even with buffers, keeping one window for each directory, I end up with thirty gvims open and have to hunt for things. So I switched to jEdit, which bothers me a bit because over the years I have collected some useful vim scripts that I love. One is a code columnizer (nice groupings of code all lined up), and others are for comments and timestamping, as well as in-file-only completion (which is handy when you are working with strange files and want completion for some of the strange stuff).

Not a lot of gripes really. Rails is nice, and I will be writing a lot more about it here, since I am diving really deep into it and will soon be working on server-side issues a lot.

I will try to keep this up to date a bit more.

Google SketchUp (free) and SketchUp Pro are fast and easy for anyone to learn; they are excellent tools to have under your belt for quickly creating some creative works.

SketchUp is an amazing program. If you don't know what it is, check out the tutorials and WATCH the first segment when you “launch the tutorial”; it goes over some of the ease of use and “wow” factor of this program that Google bought and is giving away.
One of the example images is here:


And here is one I made myself (for a game I am creating); it took me all of about 15 minutes.

A person who has never used this program can quickly model their house/apartment in probably thirty minutes (from install to finished picture, though some of the details would not be perfect, but …)

One of the things that Google added was the 3D Warehouse, in which users can share their sketches and easily import other sketches into SketchUp.

For a programmer trying to do some base graphics, or even finished graphics, who does not really know a lot about computer graphics, it comes HIGHLY recommended.

I will post more on the game I am developing jointly with some other people as I get some beta work done on it, but I am just so happy with SketchUp that I wanted to post here about it.

My no-signup, super-quick mouse movement and click-tracker has been released on the new umbrella site http://dreamvendors.com so stop by and check it out. It takes a minute to set up and get started tracking how people visit your site.

I set it up so that there is no signup necessary; each individual page that it is called from will be stored and logged.
I will post more about it later and perhaps flesh out the pages more; for now they are just quick templates I set up.

36,000 mouse and click points in, I discovered that my initial experiment in mouse position tracking had a slight flaw: I didn't expect the script to bail so early, so much of the tracking I would have gotten was lost. But never fear, there is a fix, thanks to the PHP function ignore_user_abort(TRUE) in the update script, which on a click used to bail rapidly while the browser moved to a new page or loaded different content.

FIRST THING: TO VIEW THE STATS, CLICK THE LITTLE ICON IN THE LOWER LEFT. (BLUE IS CLICKS, GREEN/ish IS MOUSE)

I got the mouse track logging back on track… The script was bailing out and I fear I may have lost some stats. Otherwise it is working well and seems to be attracting some attention from people looking for ways to see what visitors are doing with their mouse on a page.

I am considering using a technique to identify the location of certain elements of a page and try to attach the divs to them when displaying stats. This will cure some of the ills of text-size changes and of the vast range of screen resolutions. Though there is something to learn about resolutions as well, and it IS kind of easy to see.. SO… I may just make multiple display modes in the overlay.

I am also looking into the Mozilla canvas for the items on the overlay display, which could handle them programmatically very well.

I have made some changes to the stats box size on the overlay, as well as to the colors, to make it easier to see what is what.

I am mostly swamped by work and life though, so it is difficult to spend too much time.. Perhaps I can steal some time at some point this weekend to get this packaged into a hosted app.

I was looking for a good mouse/click tracking solution that visualizes where people are clicking on a site and what they mouse over the most. I looked at crazyegg.com and tested out their beta, but didn't like the fact that their JavaScript overwrote every link on my page. I also didn't like that they limited my click tracking and plan to charge for it.

(EDIT: this probably won't look very nice or even work in IE, but IE is five years old, people.. so 'maybe' you should upgrade? If you are using IE your browser is probably older than your computer, think about that….)

for those that cannot see what it is supposed to look like:

Now.. it is not so much that I don't trust other people as that I would rather control what is going on with my own stats; being a programmer, I like to keep tabs on these things. So I wrote my own, which is now running on this page. You can click on the little thermometer in the lower left (if you are in Firefox; IE may put it in the upper left) and the page overlay will pop up. It shows in green (progressive by mouseover) where people move their mouse around the site, and in blue where they are clicking. This is a running 'live' total. It limits the queries and the entries to the most relevant data, and limits the display to 500 mouse areas and 500 click areas (progressive color by count for each spot). Anything more than a thousand extra divs on a page and your browser is going to start choking, so let's keep it simple from the start and work up.

HOW IT WORKS:

  1. Register mouse moves
  2. Mark all mouse moves for the first fifteen seconds on the page via a timer (in case they don`t click)
  3. Register all clicks with prior mouse moves on the page, this is reset with each click
  4. A mouse move is regarded as a ten pixel change in the X or Y value of the mouse. All clicks are marked regardless
  5. Stats are displayed sorted by clicks DESC and then divs are created based on a normalization of the hits
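
The steps above can be sketched roughly as follows. This is only an illustration in Python, not the tracker's actual code (which is JavaScript plus a PHP update script): the function names, the choice to always keep the first point, and the scaling against the busiest spot are all assumptions made for the example.

```python
from collections import Counter

MOVE_THRESHOLD = 10  # pixels; the "ten pixel change" from step 4


def significant_moves(points, threshold=MOVE_THRESHOLD):
    """Keep only positions that differ from the last kept position
    by at least `threshold` pixels in X or in Y.

    Assumption: the first point is always kept as the starting position.
    """
    kept = []
    for x, y in points:
        if not kept:
            kept.append((x, y))
            continue
        last_x, last_y = kept[-1]
        if abs(x - last_x) >= threshold or abs(y - last_y) >= threshold:
            kept.append((x, y))
    return kept


def normalize_hits(hits, limit=500):
    """Count hits per spot, sort by count descending, and scale each
    count against the busiest spot so the intensity lands in (0, 1].

    `limit` mirrors the 500-area display cap; the scaling rule itself
    is a guess at what "normalization of the hits" means.
    """
    top = Counter(hits).most_common(limit)
    if not top:
        return []
    peak = top[0][1]
    return [(spot, count / peak) for spot, count in top]
```

The intensity value could then drive the progressive color of each overlay div, with the busiest spot drawn at full strength.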

TODO:

  • Fix the issue with browser size… great for fixed upper-left sites, bad for centered sites..
  • Maybe brighten up the colors some, they are a bit washed now, though, I am interested to see it after a week, so will wait..
  • clean up the JS… there are some dangles going on..
  • cache the results script, it will get out of hand with more than 100 sites otherwise
  • actually release it

Obviously this needs a lot of cleanup, and it might turn into a hosted app for people who have PHP and such but don't really feel like managing their own stats and just want to plop a PHP script and some JS in a dir and have immediate results.. We will see..
But for now, I am collecting stats so I can better tweak things..

Not bad for two afternoons' work :)

I tried an experiment to see just how good the panoramic software was, so I trotted down to Grand Central mid-day and took a series of photos from the stairs.

I took a series of six photos from the stairs in Grand Central; I was curious whether having people walking around in a photo would really mangle the algorithms. Guess what, it did. However, after manually going in, fine-tuning the control points and adding some at key points, voilà: no problem getting the images to line up and spit out a nice panorama.

[image: Grand Central panorama]

If you look at the image in the lower right, there is an interesting bit with two guys who seem joined and are walking in different directions, and there is one guy with his head cut off and another guy taking his place, but otherwise I am still amazed by the stitchwork these programs do on photos.

I am starting a series of panoramic shots of New York City, and I am going to start with this short article on creating panoramic images in Linux.

I have a digital camera, but I have been too lazy (and have better things to spend my money on) to buy memory sticks, so I can only take six photos with it. I am always trying to figure out what to photograph so that I am not wasting shots; would it not be better to have one GOOD photo than a series of so-so ones? So I started looking into panoramic images.

Now, I run Gentoo Linux, so clearly the imaging tools are available; I just needed to research them.

There is a half-baked tutorial/info page at the Gentoo wiki on the tools to use, though most of those tools you really don't need. The things you do NEED are the Autopano tools, Hugin, and finally enblend, which does soft blending over your images so that there are no harsh lines from color shifts due to light differences and glare.

With those tools, it is pretty easy to follow this tutorial on how to use them, though I prefer to use enblend from within Hugin, so don't follow that tutorial to the letter.

I had taken some photos from a rooftop here in Manhattan, and sat down to work on this. The following image took me about ten minutes to make, using tools I had never touched.

[image: rooftop panorama]


From camera to this image, output as a JPEG, in ten minutes!

Six images, quickly turned into a panorama.
There is one smudgy spot, which is just because I messed up the photo, but otherwise, going from images to panorama that fast is awesome.

I started a photo gallery here on the right and will try to add panoramas to it here and there as I take my camera out. I doubt my cell phone camera takes decent enough photos for doing this, but I will try that also.

I get a lot of people asking me about AJAX and what library to use, so I will write up my personal preferences here on what to use and why.

First and foremost, THE library to use is Prototype; I use a light version of Prototype that comes with the moo.fx lib.

Prototype is the 'core', not the end means for doing AJAX calls. For really easy AJAX work you need to look at moo.fx; the Prototype lite and moo.ajax can be downloaded easily from here. The reason for using moo.fx is that it is a super small library which gets the job done.

AJAX calls in moo.ajax are so easy it is ridiculous, and with Prototype's $('ID').property style of work you can just plop whatever into whatever, easy as pie.

For “full-featured” libraries for doing effects, I would suggest Rico, which has nice smooth animations and scrolling, and its drag and drop lib is really nice. The reality, though, is that you will seldom really need this functionality, so use it sparingly.

The other popular lib is script.aculo.us, whose list sorting I personally like, but the library is big and chunky; in general I feel it is just over the top, and moo.fx is generally a better choice.

Using small and tight JavaScript libraries is kind to your users and makes your page feel snappy (vs. that chunky web2.0 feel on sites that include 30 JS libs because they just HAVE to have drag and drop on things). It is also easier on you, because you don't need to read a book to use them. I was up and working with moo.ajax and moo.fx in no time, and any questions I had could be easily solved by either looking over the source (which is 3kb) or poking around in the documentation they have on their site.

note this entry is way out of date and the code is mostly gone

just an interesting tidbit

I like interesting ideas and interesting effects with js and transparency in particular.
Try the following link..
click here

You will notice that an interesting corner box appears in the upper right. I am experimenting with this and with having it called via AJAX, so that while a user is sitting on a page and an event comes in that needs their attention, you can pop up this corner box. It will not bother the user, but they will be aware of what needs to be done and are offered a link to check it out or to dismiss the effect.

I tried to get this to work with IE, I really did, but the PNG-fixing script cannot seem to fix the transparency in a hidden cell.

Funny how all the “fixes” are for IE now..

Short paper on large scale data indexing for retrieval.

When dealing with large amounts of data, it is at times necessary to index outside of your current database, particularly when space is an issue.

The new MySQL archive storage engine is perfect for this scenario; however, you still need to index the data yourself, as the archive engine is compressed and does not support indexes (and rightfully so).

There are several camps of thought in this arena:

With a hashed index, you need to deal with very specific queries, and the hash tables will take up large amounts of space.

With a bucketed system the data is highly organized, but space will still be a major factor, as will retrieval, and again you will need to store an additional database that lists the available data. Each piece of an index is broken down. You set up directories for the data like 0-9_a-z_A-Z (62 directories) and in each create an additional set of 0-9_a-z_A-Z; thus when retrieving the data for record number “DJ192” you would find that item in /hashdirs/D/J/DJ192. (Very simply put, this is a bucket system; there are obviously much more advanced versions of bucketing, even within databases.)
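
As a small illustration of the directory layout just described, here is a minimal Python sketch that maps a record ID to its bucket path. The function name, the `root` default and the validation rules are assumptions for the example; only the 62-character alphabet and the /hashdirs/D/J/DJ192 layout come from the text.

```python
import string
from pathlib import PurePosixPath

# 0-9, a-z, A-Z: the 62 directory names used at each level
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def bucket_path(record_id, root="/hashdirs", depth=2):
    """Return the bucketed path for a record: one directory level per
    leading character of the ID, then the ID itself as the file name."""
    if len(record_id) < depth:
        raise ValueError("record id is shorter than the bucket depth")
    prefix = record_id[:depth]
    if any(ch not in ALPHABET for ch in prefix):
        raise ValueError("bucket characters must be 0-9, a-z or A-Z")
    return str(PurePosixPath(root, *prefix, record_id))


print(bucket_path("DJ192"))  # /hashdirs/D/J/DJ192
```

With two levels this gives 62 × 62 = 3,844 leaf directories, which keeps any single directory from growing too large.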

Bloom filters were designed (quite a while back) for a different usage, but have proven to be fast, small and can be tweaked for reliability in this area.

A Bloom filter is a “ONE-WAY” hash, meaning that you cannot derive the type of data stored in a filter; you can only check for the existence of a piece of data. Traditionally, Bloom filters are used in spellchecking applications, RFID identification for stores, and anywhere that you may have enormous numbers of distinct chunks of data.
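
To make the idea concrete, here is a minimal sketch of a plain, textbook Bloom filter in Python. It is not the De Gan filter; the class name, the default sizes and the use of seeded SHA-1 to derive bit positions are all choices made just for this example.

```python
import hashlib


class BloomFilter:
    """Textbook Bloom filter: a bit array plus several hash functions.

    Items can be added and tested for membership, but never read back
    out. A lookup can give a false positive, never a false negative.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # must stay below 256 for the seed byte
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash by prefixing a seed byte.
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha1(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


words = BloomFilter()
words.add("panorama")
print("panorama" in words)  # True: an added item is always found
```

The "tweaking for reliability" amounts to choosing the number of bits and hashes: more bits per stored item means fewer false positives, at the cost of space.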

De Gan filters are my improvement on Bloom filters, and they have been able to store enormous amounts of data in a hashed thread-table. They are not perfect and need some tweaks, but for one-way hashing tables in a scripting language they are well formed and fast (even without bitshifting implemented).

Well, that is great but how is this useful?

First, you will never search your indexes for items that are not in the database. Second, you can greatly speed up your searches by using this filter. Third, when you create the record in the filter, you can also store a combined hash (md5 over the sha1/md5 combo) and use that to reference your data. In the case of a text article, this is useful for hashing word pairs, so that you can easily search them and know that your data retrieval will not waste any milliseconds checking for data that is non-existent or actually combing through the data.
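
One possible reading of the "md5 over the sha1/md5 combo" key, applied to word pairs, is sketched below in Python. The exact combination (concatenating the two hex digests and hashing the result with md5) is an assumption, as are the function names and the lowercase, whitespace-split tokenizing.

```python
import hashlib


def combined_key(text):
    """md5 over the concatenated sha1 and md5 hex digests of `text`.

    Assumption: this is one interpretation of the combined hash; the
    original scheme may order or join the digests differently.
    """
    data = text.encode("utf-8")
    sha1_hex = hashlib.sha1(data).hexdigest()
    md5_hex = hashlib.md5(data).hexdigest()
    return hashlib.md5((sha1_hex + md5_hex).encode("ascii")).hexdigest()


def word_pairs(article):
    """Split an article into adjacent, lowercased word pairs."""
    words = article.lower().split()
    return [f"{a} {b}" for a, b in zip(words, words[1:])]


# Map each word pair of an article to the key used to reference it.
index = {pair: combined_key(pair) for pair in word_pairs("The quick brown fox")}
```

A search would then hash the query's word pairs the same way, check the filter first, and only touch the stored data for pairs the filter reports as present.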

For small(ish) sets MySQL fulltext is fine; I have had no problems or CPU issues on a database with over 5,000,000 records running fulltext searches over multiple fields. However, when you go to 50M, 100M or 500M records (depending on your hardware) you are going to see some drastic reductions in search performance, especially if the database is being heavily used.

With the filters I developed, it is entirely possible to fit 100M records in a one-way hash in about 20-25 MB of space. In a MySQL database, the records and the indexes would most likely take 200 MB to 500 MB, depending obviously on what is in the table; even for an indexed lookup it will be much larger than the 25 MB for a hash. Additionally, you will need to store alongside the table a further index of the hashes. The best way to do this is to bucket the databases and tables (databases A-Z etc. and tables A-Z), which will be large but much faster than a straight single table.
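
For a sense of scale, the space figures can be put through the standard Bloom filter formulas. The Python sketch below covers only a plain Bloom filter; the De Gan filter is not specified here and may well behave differently. The function names and the decimal reading of "MB" (10^6 bytes) are assumptions.

```python
import math


def bits_per_record(size_bytes, records):
    """How many filter bits are available for each stored record."""
    return size_bytes * 8 / records


def false_positive_rate(bits_per_item, num_hashes):
    """Standard approximation for a plain Bloom filter:
    p = (1 - e^(-k / (m/n)))^k, with k hashes and m/n bits per item."""
    return (1 - math.exp(-num_hashes / bits_per_item)) ** num_hashes


def bits_needed(records, target_rate):
    """Bits a plain Bloom filter needs for a target false-positive rate,
    assuming the optimal number of hashes: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-records * math.log(target_rate) / (math.log(2) ** 2))


# 100M records in 25 MB leaves 2 bits per record.
budget = bits_per_record(25 * 10**6, 100 * 10**6)
rate_at_budget = false_positive_rate(budget, num_hashes=1)
mb_for_one_percent = bits_needed(100 * 10**6, 0.01) / 8 / 10**6
```

Under these assumptions a plain Bloom filter at 2 bits per record would report a false positive for roughly 4 lookups in 10, and would need on the order of 120 MB for a 1% rate, so the 20-25 MB figure depends on whatever the De Gan filter does differently.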

Does this work?

Ask Google: the founders' original intent and design were published for all to view and study. It is like a roadmap on how to build a monster, ever-scalable search engine.

Edit: I also found the sparsehash project that Google released.
