Archive for the 'server' Category

 

infer latitude and longitude from cell id and the FCC database

Sep 02, 2006 in coding, phones, server

The problem: most current cell phones (apart from the very newest) do not come with a built-in GPS receiver. Phones do, however, connect to radio towers within range, and those towers broadcast IDs. I want to find out whether there is an easier way to get a lat/long location than carrying a Bluetooth GPS device: inferring it from publicly accessible data and geocoding, combined with the cell tower ID broadcast to my phone.

In many countries the latitude and longitude of cell towers is public data. In the US it is only sort of public: when registering a tower with the FCC, companies do not need to register the cell-id that the tower broadcasts to phones. To make matters more complex, different cell companies can broadcast different IDs from the same tower. Because of this, many projects have popped up that attempt to map cell-ids to latitude and longitude. They work, I assume, by refining the results for each tower. The good news is that the FCC, being a public entity, is bound by the Freedom of Information Act and must publish its data (even if in a nasty form). The FCC provides a large database of the exact locations of towers; we just don’t know what those towers are called or who uses them.

The idea here is that we can make some educated guesses to refine the FCC data, using what little information the tower itself gives us plus some simple server-side geocoding. Basically we are going to hack the cell-ids ‘into’ the FCC data, which we must assume has missing towers and incomplete data, since the requirement to register a tower only applies to certain tower types and array heights.

If you read my post about working with lat/long pairs in MySQL and doing great circle searches, you may find the following code snippets interesting, as they do computations on latitude and longitude. I worked them out by porting some existing code and reading some math forums; they will come in handy for inferring my location from its proximity to cell towers.

First we have a few support functions: converting to radians, and converting between lat/long formats (degree/minute/second to decimal degrees and back):

function rad($v){
    return ($v * M_PI / 180);
}

/*
degrees, minutes, seconds to decimal degrees
$dir is a compass letter (N, S, E, W); south and west come back negative
note: this could also be done in a single line using a ternary
*/
function dms2deg($D, $M, $S, $dir){
    // the leading space keeps a match from landing at position 0
    if (strpos(' WwSs', $dir) > 0){
        return (-1 * ($D + ($M + $S/60)/60));
    } else {
        return ($D + ($M + $S/60)/60);
    }
}

/*
radians to a degree/minute/second string
the sign is dropped, so the caller adds N/S/E/W
*/
function dms($rad) {
    $d = abs($rad * 180 / M_PI);
    $d += 1/7200; // add ½ second for rounding
    $deg = floor($d);
    $min = floor(($d - $deg) * 60);
    $sec = floor(($d - $deg - $min/60) * 3600);
    // sprintf adds leading zeros if required
    // (PHP joins strings with "." and "+" would do arithmetic instead)
    return sprintf('%03d°%02d′%02d″', $deg, $min, $sec);
}

These are used in the functions that follow. The first thing we need is a way to determine the distance between two points on the map. We can use the following PHP function to do this:

/*
distance between two points using the spherical law of cosines
cos c = cos a cos b + sin a sin b cos C
*/
function distance($lat1, $lon1, $lat2, $lon2, $units = 'miles'){
    $lat1 = rad($lat1); $lon1 = rad($lon1);
    $lat2 = rad($lat2); $lon2 = rad($lon2);
    // earth radius in the requested units
    switch ($units){
        case 'nmiles': $r = 3443.9; break;
        case 'kilo':   $r = 6378; break;
        case 'miles':
        default:       $r = 3963.1; break; // unknown units fall back to miles
    }
    // this is another way to do it
    #$rv = pi()/180;
    #$a1 = $lat1 * $rv; $b1 = $lon1 * $rv;
    #$a2 = $lat2 * $rv; $b2 = $lon2 * $rv;
    #return (acos(cos($a1)*cos($b1)*cos($a2)*cos($b2) + cos($a1)*sin($b1)*cos($a2)*sin($b2) + sin($a1)*sin($a2)) * $r);

    // clamp to [-1, 1] so float rounding cannot hand acos() an invalid value
    $c = sin($lat1)*sin($lat2) + cos($lat1)*cos($lat2)*cos($lon2-$lon1);
    return acos(max(-1, min(1, $c))) * $r;
}
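Cell towers are often only a few blocks apart, and the law of cosines loses precision when two points are that close together. The haversine formula is a common alternative for short distances. This is a minimal sketch, not part of the code above: the name haversine_miles() is made up for illustration, and the default radius simply reuses the 3963.1 miles from distance().

```php
<?php
// Hypothetical helper (not from the original code): haversine distance.
// Inputs are decimal degrees; the result is in the units of $r (miles by default).
function haversine_miles(float $lat1, float $lon1, float $lat2, float $lon2, float $r = 3963.1): float
{
    $phi1 = deg2rad($lat1);
    $phi2 = deg2rad($lat2);
    $dPhi = deg2rad($lat2 - $lat1);
    $dLam = deg2rad($lon2 - $lon1);

    // a is the square of half the chord length between the two points
    $a = sin($dPhi / 2) ** 2 + cos($phi1) * cos($phi2) * sin($dLam / 2) ** 2;

    // min() guards against rounding pushing sqrt($a) slightly above 1
    return 2 * $r * asin(min(1.0, sqrt($a)));
}

// Union Square to a point a few blocks north
printf("%.4f miles\n", haversine_miles(40.7383040, -73.99319, 40.7420, -73.99319));
```

For points that are far apart the two formulas agree closely, so this only matters for the short hops between neighbouring towers.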

Next, it would be helpful to know the bearing from one point to another point.

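/*
initial bearing from point 1 to point 2
takes degrees like distance() and midpoint(); the result is in radians,
measured clockwise from north, in the range -π to π
*/
function bearing($lat1, $lon1, $lat2, $lon2) {
    $lat1 = rad($lat1); $lon1 = rad($lon1);
    $lat2 = rad($lat2); $lon2 = rad($lon2);
    $y = sin($lon2-$lon1) * cos($lat2);
    $x = cos($lat1)*sin($lat2) - sin($lat1)*cos($lat2)*cos($lon2-$lon1);
    return atan2($y, $x);
}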

We would also like to be able to find the exact midpoint between two points:

/*
find the midpoint between two points (input and output in degrees)
*/
function midpoint($lat1, $lon1, $lat2, $lon2) {
    $lat1 = rad($lat1); $lon1 = rad($lon1);
    $lat2 = rad($lat2); $lon2 = rad($lon2);
    $dLon = $lon2 - $lon1;
    $Bx = (cos($lat2) * cos($dLon));
    $By = (cos($lat2) * sin($dLon));
    $lat3 = atan2( sin($lat1) + sin($lat2), sqrt( (cos($lat1)+$Bx) * (cos($lat1)+$Bx) + ($By * $By)) );
    $lon3 = $lon1 + atan2($By, cos($lat1) + $Bx);

    // 0 is a valid latitude or longitude, so test for NAN rather than falsiness
    if (is_nan($lat3) || is_nan($lon3)) return false;
    return array($lat3 * 180 / M_PI, $lon3 * 180 / M_PI);
}

Now that we have some functions for working with lat/long pairs, we need to see how to put things together. I am going to use the following image to help explain some concepts first:

In this bad image, the numbers 1-5 mark the path we take and what we can see happening at each spot along it. The lower-case letters are cell towers that might be broadcasting to us, but for which we only have a row in the FCC database telling us that a tower IS there; the upper-case ones are towers we know nothing about. We have a little app (that will not be public just yet) that runs all the time and notifies us with a beep when it finds a new cell-id. We then stop where we are and type in the nearest address, which the server geocodes to our location and stores along with the cell-ids and signal strengths.

POINT 1, Central Park: We can see cell tower “a”. We geocode our location and mark down in the database that tower “a” is within range of the geocoded location. All is good, and this is about as far as anyone has gotten so far. We run a great circle search from our location, mark the towers near us as candidates for cell-id “a”, and also make a note of the signal strength.

POINT 2, West Side: We pick up an unknown tower “D” and mark it down along with a geocoded location. We also pick up “a” again and mark it again, along with its signal strength, and tag the towers within our great circle with the cell-id info. This helps refine “a”, and since the signal strength is lower we assume we are further from it.

POINT 3, Hell’s Kitchen: We now pick up another tower, “b”, which we can mark in the database. We also pick up “a” here, but both have low signal strength; we mark both with their cell-ids and signal strengths. We can figure that we are about halfway between “a” and “b”. We can also make some guesses about the direction being travelled by combining our current point, the current cell, and more than one reading.

POINT 4, West Village: Now we only have tower “b”. It is already known, so we mark it with its signal strength and are just confirming its presence.

POINT 5, Fulton Street: Again we have a known tower, “c”, and we mark it along with the signal strength. Our prior mark was with tower “b”, so we can infer from signal strength and our prior location that we should be about halfway between “b” and “c”.
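The “about halfway between” guesses at points 3 and 5 can be generalized a little. Here is a rough sketch of one way to do it, a signal-weighted average of the tower coordinates. The function name, the array layout and the idea of weighting linearly by signal strength are all my assumptions for illustration; real signal strength does not fall off linearly with distance, so treat the result as a coarse guess.

```php
<?php
/**
 * Hypothetical helper: estimate a position as the signal-weighted average
 * of the towers currently in view.
 *
 * $towers is a list of ['lat' => float, 'lon' => float, 'signal' => float],
 * where a larger 'signal' means a stronger reading (assumed scale).
 * Averaging raw lat/long is only reasonable over small areas such as a city.
 */
function estimate_position(array $towers): ?array
{
    $sumW = 0.0;
    $lat = 0.0;
    $lon = 0.0;
    foreach ($towers as $t) {
        $w = max(0.0, (float) $t['signal']); // ignore negative weights
        $lat += $w * $t['lat'];
        $lon += $w * $t['lon'];
        $sumW += $w;
    }
    if ($sumW == 0.0) {
        return null; // nothing usable in view
    }
    return [$lat / $sumW, $lon / $sumW];
}

// Two towers with equal signal: the estimate lands halfway between them.
print_r(estimate_position([
    ['lat' => 40.0, 'lon' => -74.0, 'signal' => 1.0],
    ['lat' => 41.0, 'lon' => -73.0, 'signal' => 1.0],
]));
```

With equal readings this reduces to the halfway guess used in the walk-through; a stronger reading pulls the estimate toward that tower.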

A lot of this data can be refined as we go, but as you can see, with only a few real points and a few cell towers we can gather a lot of location data. We can also keep a live map of our locations and where we have travelled. In some areas we can get a location accurate enough to zoom a map to street level close by.

In an area such as Manhattan, with its intense cell coverage, we can quickly narrow our location down to a reasonable area and pinpoint which row in the FCC database corresponds to which cell-id just by walking around for a few minutes. In more rural areas this might be a slightly more interesting problem, but if you mark your headings as you travel we could even determine your location on the map from your speed and bearing.
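Working out a position from speed and bearing is plain dead reckoning: distance is speed multiplied by time, and the new point follows from the start point, the bearing and that distance. A minimal sketch, assuming a spherical earth with the same 3963.1 mile radius used above; destination() is a hypothetical name, not something from the code in this post.

```php
<?php
// Hypothetical helper: where do we end up after travelling $distance from
// ($lat, $lon) along an initial bearing given in degrees clockwise from north?
// Returns [lat, lon] in degrees; longitude is not normalized to -180..180.
function destination(float $lat, float $lon, float $bearingDeg, float $distance, float $r = 3963.1): array
{
    $lat1 = deg2rad($lat);
    $lon1 = deg2rad($lon);
    $theta = deg2rad($bearingDeg);
    $d = $distance / $r; // angular distance in radians

    $lat2 = asin(sin($lat1) * cos($d) + cos($lat1) * sin($d) * cos($theta));
    $lon2 = $lon1 + atan2(
        sin($theta) * sin($d) * cos($lat1),
        cos($d) - sin($lat1) * sin($lat2)
    );
    return [rad2deg($lat2), rad2deg($lon2)];
}

// 30 mph for 10 minutes, heading due east from Union Square
list($lat, $lon) = destination(40.7383040, -73.99319, 90.0, 30 * (10 / 60));
printf("%.5f, %.5f\n", $lat, $lon);
```

Errors add up quickly with this approach, so each new tower sighting would be a good moment to reset the estimate.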

web page templating and design, best practices

Aug 31, 2006 in coding, general, server

Templating web pages is a tricky business with a lot of issues and options, and the vast majority of people do it totally and completely wrong. A template, in terms of documents, is any document fragment with blank spaces designed for you to ‘fill’ in. It is something that establishes a pattern. In terms of web pages, a template is the design and interface aspect of a site, and it should be entirely separate from the logical portions of the site.

Let me restate that: “it needs to be entirely separate from the logical portions of the site”. The primary reason for this is maintainability. The reason we use templates in the first place is to organize our development and have two distinct layers, the logic and the design. Remixing logic into design (as Smarty and others do) runs counter to the primary reason we chose to use templates in the first place. With a proper template system, if there is a database or looping issue we know where to look right away, and we don’t have to sift through the 20 templates that make up a page to find which one has the bug.

So, I will introduce the templating system I wrote that forces this issue (it appears below). The first thing we need to do in creating a perfect template system is balance ease of use against speed, and determine how it will function and where it will store its template files. It also needs ‘complete’ separation of logic and design.

A few things:

If I request index.php in the root directory of a site, the template engine will look in TPL_BASEDIR + “/index.php/” (yes, a directory with the name of the file) for the individual parts of that template. We will also have a global directory for headers and footers, and the engine will look backwards up to three directories to try to find a template.

Template files will have a .tpl extension and will NOT be eval’d, so that no code can be run in them and there is no temptation to apply the “quick fix” of adding code to them. The one exception is code inside HTML-styled <php> tags, which is eval’d for outputting dates or other small single variables (this was added to help the transition from systems like Smarty).

We want to be able to set some scheme defaults for our template, things we don’t want to bother with in the logic portion. Mostly these will be color codes for CSS files and pages (this helps keep design out of the programmers’ hands).

We need to be able to have multiple sites using the same template engine.

We need to be able to sling SQL statements and query result sets at the engine and have it use those if possible. This is a very DRY approach, similar to how Rails does some things.

It needs to be able to process CSS and JavaScript files as well as HTML fragments.

We want a set of design-related functions for doing things inside the templates if we need to, similar to the “cycle()” helper from Ruby on Rails, which are called automatically and are set-and-forget. This is useful when dealing with row colors, so the programmer does not need to worry about colors or style classes.

We want the template system to be fast, so we build in a mechanism that ensures pre-processed templates are not loaded more than once.
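To make the cycle()-style helper from the list above concrete, here is a minimal sketch of what such a function might look like in PHP. It is not taken from the template engine itself; the name, signature and per-list counters are assumptions for illustration.

```php
<?php
// Hypothetical helper: each call returns the next value from the list,
// wrapping around at the end. Every distinct list keeps its own counter.
function cycle(string $first, string ...$rest): string
{
    static $counters = [];

    $values = array_merge([$first], $rest);
    $key = implode("\0", $values);   // identifies this particular list
    $i = $counters[$key] ?? 0;       // how many times it has been used so far
    $counters[$key] = $i + 1;

    return $values[$i % count($values)];
}

// Alternating row classes without the programmer tracking a counter
foreach (['apples', 'pears', 'plums'] as $item) {
    echo '<tr class="', cycle('odd', 'even'), '"><td>', $item, "</td></tr>\n";
}
```

The point of the design is that the logic layer only hands over rows; which class or color each row gets stays entirely on the design side.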

What we end up with is a fast and proven system, see:

The template system (txt)

To use this you would include the file and set up a template named replace.tpl with the contents:

“Here is the {MYVARIABLE} for sale”

In PHP you do:

$replacements['template'] = 'replace.tpl';

$replacements['myvariable'] = "house";

template($replacements);

and you will get “Here is the house for sale”
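If you just want to see the core idea without downloading the engine, the substitution step can be sketched in a few lines. This is not the engine linked above, only an illustration: render_template() is a made-up name, it works on a string rather than a .tpl file, and upper-casing the keys is my guess at how 'myvariable' comes to match {MYVARIABLE}.

```php
<?php
// Hypothetical sketch of placeholder replacement; not the author's engine.
// Keys are upper-cased and wrapped in braces, so 'myvariable' fills {MYVARIABLE}.
// Nothing in the template is eval'd; it is treated as plain text.
function render_template(string $template, array $replacements): string
{
    $pairs = [];
    foreach ($replacements as $name => $value) {
        $pairs['{' . strtoupper($name) . '}'] = (string) $value;
    }
    // strtr() replaces all placeholders in one pass and leaves unknown ones alone
    return strtr($template, $pairs);
}

echo render_template('Here is the {MYVARIABLE} for sale', ['myvariable' => 'house']), "\n";
```

Because the template is only ever treated as text, there is no way to sneak logic into it, which is the whole point.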

If you want, you can also use your templates from JavaScript. I have some big plans and loads of ideas for ways to use JSON and Prototype.

A very similar system was in use at my old work (this is an improved and revamped version I wrote for another job a while back). We managed 20+ programmers of various skill levels with this template engine and it worked perfectly. Sites handling tens of millions of hits per day use this system and it works well.

We also never had to poke around in HTML to look for a programming bug.

Try it, you might like it. You can even easily turn it into a class; if you are running PHP5, turning it into a class will not slow it down the way it did in prior versions.

p.s. Later I will post some old code for doing in-memory content caching using shared memory (shmop); it is easily fitted into the above template system to speed it up even more.

review: dreamhost = suck

Aug 30, 2006 in general, server

I run my own server, but I need to use Dreamhost servers in the course of my day and in dealings with people, and I would like to relay some of the pain. Most professional hosting is good, but if you are having issues with your hosting at Dreamhost you should consider leaving, because clearly they are only great for a little girl’s ‘first blog’ whose owner uses free webmail somewhere else instead of relying on the rubber bouncy balls of Dreamhost’s servers. If you need uptime, don’t apply. If you need stability, don’t bother. Need speed? No can do; they have the crappiest bandwidth I have ever experienced. Need a database? If it works, they will manage to garble it. Need email for business? No way, unless it is not time critical and can sit for days. At this point, after seeing friends and people I have worked with (and still work with) deal with the suck, I am at a loss as to how this company can stay in business when it is barely able to host plain HTML, let alone anything else.

Before you go off: I worked on the team that kept datacenters running for two weeks through one of the worst natural disasters ever to happen, and we did it without downtime in the middle of a hurricane, through looting and so on. Dreamhost, on the other hand, had a little heatwave and went flopping over like a rotten fish.

So, maybe that was not their fault? Well, for one of the sites for our work, this morning the site was down with MySQL errors, with no email from or to them for a while because Dreamhost mail works like Iraqi power. Then, when mail got through, it came out that there was an SQL query they could not figure out how to stop, so they did us the wonderful favor of RENAMING the MySQL tables to stop the query from running. No email, nothing. And to top it all off, they then said “learn about mysql EXPLAIN” (because it is not their fault in any way), about a site that has been running for a long while with no issues and no complex queries. I should not even need to “explain” to someone how unacceptable this is.

Constant downtime, never-ending mail troubles, Geek Squad rejects as system admins, and legions of people sending around coupon codes trying to save money… Do yourself a favor: protect yourself by using anyone else.

mysql latitude/longitude radius

Aug 18, 2006 in coding, server

note: this is more of a note to myself than anything

To do a latitude/longitude radius search from a point in MySQL, first you need a table with latitude/longitude pairs for the items to look up. You can find various databases online with geocoded data.

Now let’s assume we are on a spot near Union Square in New York.
Its geocoded location is: Latitude: 40.7383040 and Longitude: -73.99319

We assume the following values for the earth’s radius (R):
6378137 meters, 6378.137 km, 3963.191 miles, 3443.918 nautical miles
We pick one of these depending on whether we want distances from our starting point in miles, kilometers or meters. If you really wanted to get crazy, 6378137 meters = 20925646.3 feet, so you could literally search for something within several hundred feet of yourself.

The value we choose for R in the following SQL determines the units of the distance from the point of origin, so using R = 3963.191 gives us our distance back in miles. For this example we want to see what in our database is within 1.5 miles of this point.

note: PI = 3.141592653589793. MySQL’s PI() function displays 3.141593 by default, but it uses the full double-precision value internally, so it is fine to call PI() directly in the computation. latitude and longitude are the names of fields in my database in the query below.


SELECT asciiname, latitude, longitude,
    3963.191 * ACOS(
        SIN(PI() * 40.7383040 / 180) * SIN(PI() * latitude / 180) +
        COS(PI() * 40.7383040 / 180) * COS(PI() * latitude / 180) * COS(PI() * longitude / 180 - PI() * -73.99319 / 180)
    ) AS distance
FROM allcountries
WHERE 1=1
AND 3963.191 * ACOS(
        SIN(PI() * 40.7383040 / 180) * SIN(PI() * latitude / 180) +
        COS(PI() * 40.7383040 / 180) * COS(PI() * latitude / 180) * COS(PI() * longitude / 180 - PI() * -73.99319 / 180)
    ) <= 1.5
ORDER BY distance

This can be run over the results of a sub-query as well, so if you have a huge dataset you can first select a square area by just adding to and subtracting from both longitude and latitude, then take that result set and run this query over it. The query gives back results ordered from closest to furthest.

In tests over 6.2M records on my machine here, with latitude and longitude indexed, this took 21.3 seconds to complete without the subquery and 0.05 seconds with the subquery run first.
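How much to add and subtract for the square area is worth spelling out, since a degree of longitude gets narrower as you move away from the equator. The following is a rough sketch of computing the box in PHP and building the inner query. bounding_box() is a hypothetical name, the table and column names are the ones from the query above, and the sketch assumes the box does not cross a pole or the 180° meridian.

```php
<?php
// Hypothetical helper: square "bounding box" around a point.
// $radius and $r must be in the same units (miles by default).
// Assumes the box does not cross a pole or the 180° meridian.
function bounding_box(float $lat, float $lon, float $radius, float $r = 3963.191): array
{
    $dLat = rad2deg($radius / $r);
    // a degree of longitude shrinks with cos(latitude)
    $dLon = rad2deg($radius / ($r * cos(deg2rad($lat))));
    return [
        'min_lat' => $lat - $dLat, 'max_lat' => $lat + $dLat,
        'min_lon' => $lon - $dLon, 'max_lon' => $lon + $dLon,
    ];
}

$box = bounding_box(40.7383040, -73.99319, 1.5);

// The cheap, index-friendly square cut; the great circle query then runs
// over this result set instead of the whole table.
$sql = sprintf(
    'SELECT asciiname, latitude, longitude FROM allcountries
     WHERE latitude BETWEEN %F AND %F AND longitude BETWEEN %F AND %F',
    $box['min_lat'], $box['max_lat'], $box['min_lon'], $box['max_lon']
);
echo $sql, "\n";
```

The square is slightly larger than the circle, so the great circle filter is still needed afterwards to trim the corners.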

How to setup rails on cpanel (quick and dirty)

Aug 17, 2006 in rails, server

There is no cpaddon module for Rails, so you gotta get your hands dirty. This is the fast way I got it to run; I will make notes where needed. This is current as of August 2006 with the latest “stable” cPanel WHM.

As the root user, install ruby, gems, rails, fcgi and mod_fastcgi, add a few configuration lines to httpd.conf, and restart, as follows (condensed from: here):

$ cd /usr/local/src
$ wget ftp.ruby-lang.org/pub/ruby/ruby-1.8.4.tar.gz
$ tar -xvzf ruby-1.8.4.tar.gz
$ cd ruby-1.8.4
$ ./configure && make && make install

$ cd /usr/local/src
$ wget rubyforge.org/frs/download.php/5207/rubygems-0.8.11.tgz
$ tar -xvzf rubygems-0.8.11.tgz
$ cd rubygems-0.8.11
$ ruby setup.rb

$ gem install rails

$ cd /usr/local/src
$ wget fastcgi.com/dist/fcgi-2.4.0.tar.gz
$ tar -xvzf fcgi-2.4.0.tar.gz
$ cd fcgi-2.4.0
$ ./configure && make && make install

$ cd /usr/local/src
$ wget fastcgi.com/dist/mod_fastcgi-2.4.2.tar.gz
$ tar -xvzf mod_fastcgi-2.4.2.tar.gz
$ cd mod_fastcgi-2.4.2
$ /usr/local/apache/bin/apxs -o mod_fastcgi.so -c *.c
$ /usr/local/apache/bin/apxs -i -a -n fastcgi mod_fastcgi.so

$ gem install fcgi

$ mkdir -p /tmp/fcgi_ipc
$ chown -R nobody:nobody /tmp/fcgi_ipc
$ chmod -R 755 /tmp/fcgi_ipc

Then in /etc/httpd/conf/httpd.conf add

LoadModule fastcgi_module libexec/mod_fastcgi.so
<IfModule mod_fastcgi.c>
    FastCgiIpcDir /tmp/fcgi_ipc/
    AddHandler fastcgi-script .fcgi
</IfModule>

Install any other gems you want, like rmagick and gettext, and then restart Apache however you like. Remember, if you install a gem you must restart Apache to be able to use it.

Now to actually get Rails running, follow what I have done with this domain, using your own user:

$ su people
$ cd ~
$ rails test
$ cd public_html
$ ln -s ../test/public/ rails
$ cd ../test/
$ chmod -R 777 tmp/
$ cd public
$ chmod 755 dispatch.fcgi
$ vim .htaccess

chmod -R 777 and chmod -R a+rwx have the same effect here; a tighter mode would be better, but that is an afterthought to just getting this done

In .htaccess, change “dispatch.cgi” to “dispatch.fcgi”

Load up http://peoplesdns.com/rails/

One of the main issues I have seen from people is that they get it running, but if the tmp directory is not writable then Rails pukes and gives a bunch of errors. The chmod above seems to fix the issue.

An easy way to standardize Rails setups would be to rename /usr/bin/rails to “runrails” and put a wrapper shell script named rails in its place that handles the steps above, along with setting up a “_RAILS” dir in the user’s folder and creating all projects inside that.

If you want to use MySQL (duh):
gem install mysql

If you don’t do the above, Rails doesn’t panic; it starts spitting out “Lost connection to MySQL server during query” errors all over, which really tell you nothing. So make sure to install the mysql gem and save yourself some headaches.

Install whatever other gems you want.
Then, if you actually want to use your gems, you must
/etc/init.d/httpd restart
It is sort of like installing anything on Windows: you gotta reboot the whole thing.

Anyway, thought that might come in handy for someone.