August 23rd, 2009
Oh that we could all be a little more forgiving like Luis below.

About a year ago I posted about my hopes for android and it helping me write CrumbTracker. Been about a month ago that I got my MyTouch, a google phone running android. I have kind of caught the vision of app stores and the like. Played a fair amount of a pretty killer tower defense game, Robo Defense, which I even bought after getting hooked on the trial.
This past week, in my copious spare time, I dove in and worked on my android app CrumbTracker, that ties into my first real java appengine work at the web site. The vision is that the phone phones home to an appengine web site. If you have an android powered phone, give it a go. Last night I got it so the uploaded crumbs show up on a google maps map. Think I will write a series of blog posts about my experience getting it all up and running. Non-trivial, but hey I have only worked on it a few nights, so that’s pretty cool.
Here are some posts I am thinking about writing
- switching a domain from google apps to appengine
- SQLite on android
- the datastore on java-powered app engine
- the emulator
- doing an http post from java / android
- recognizing you’re in the android emulator from java
- sharing code (or not) across android / appengine projects
- ui in android
- android location management
- google maps initiation
- svn hiccups
Wow, that sounds VERY interesting! And I hope it is.
Enjoy!
Earl
Posted in android, appengine, java, sqlite | No Comments »
August 7th, 2009
Maybe I should wait to write three of them, but if I worked for the onion, I think I would submit the following for consideration
- Study finds global warming caused by computers studying global warming.
- Two NFL athletes move into training camp facility to fight for “first one to practice, last one to leave” title.
What do you think? Should I send them my resume? Maybe I’ll wait till I get a third one.
Enjoy!
Earl
Posted in Uncategorized | No Comments »
July 21st, 2009
So, I have a couple famous friends and one of them asked if I could write a little app that could help him send emails to folks on his totally legit email list. Well, since it is a totally legit email list, folks that really do want to hear from him, I felt pretty comfortable writing the app on google appengine. About a day after he asked for a little help, google released an offline taskqueue that could be used nicely for sending emails. Rather serendipitous, I thought.
Lesson in non-engineer customers: sometimes they want you to do things that you think you can’t do, but with a little effort you can. My friend said he would like to just maintain spreadsheets in google docs for who to send to, then have my program look at the spreadsheets and send accordingly. I was like, “yeah, I don’t think we can do that.” We started to discuss how to maintain the email list just with my app, how to keep things in sync, how to query for users in a certain country or state, etc. Messy. Well, turns out using the gdata api, I can authenticate a user, talk to google docs, allow the user to pick a spreadsheet / worksheet, then pull stuff like email addresses. Awesome!
Took a few nights, but I have something up and running. My friend says he will give me some nice powered by links when he sends and I am hoping for a blog post here or there.
If you happen to have a totally legit email list (very serious about that part) and would like to trade some sending help for some publicity / marketing, please drop a line to cahille AT yahoo DOT com.
Enjoy!
Earl
June 30th, 2009
For ages I have been meaning to add some sort of search to mycomparer. Well, we’re live! I spent maybe four hours total on it over two nights, and implemented the following features
- walk through each word in the query and search categories and += matching categories
- walk through each word in the query and check against upcs
- walk through each word in the query and check against affiliate ids, like searching by asin
- good old full text search via mysql
I did the first three the first night, and started the full text search. Here’s what I had to do.
- Create a table
- CREATE TABLE sh_product_my_text (
product_id INT NOT NULL,
FOREIGN KEY (product_id) REFERENCES sh_product(id) ON DELETE CASCADE,
my_text TEXT NOT NULL,
FULLTEXT(my_text),
timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
) ENGINE=MyISAM;
- then in my shopping db population process, I populated the table with some product stuff
- I change a query like “computers netbook -wireless” to “+computers +netbook -wireless”
- the above ends up in ? for
- SELECT product_id FROM sh_product_my_text WHERE MATCH(my_text) AGAINST(? IN BOOLEAN MODE) limit 20;
- also added --key_buffer_size=1024M to my mysql config. pretty terrible before this change, pretty good after
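The query rewrite step above is small enough to sketch. This is a rough Python version, not the code the site runs; the function name to_boolean_query is made up for illustration. It prefixes every bare word with + so that boolean mode requires it, and leaves words that already start with + or - alone.

```python
def to_boolean_query(query: str) -> str:
    """Prefix bare words with '+' so MySQL boolean mode requires them."""
    terms = []
    for word in query.split():
        # Words that already carry a +/- operator are passed through untouched.
        if word[0] in "+-":
            terms.append(word)
        else:
            terms.append("+" + word)
    return " ".join(terms)


print(to_boolean_query("computers netbook -wireless"))
# prints: +computers +netbook -wireless
```

The result is what gets bound to the ? placeholder in the MATCH ... AGAINST query, so the user's text never gets pasted into the SQL itself.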
Course, I implemented it as a service and tie into the service via my Template::Plugin::WebService with code that looks like
[% USE web_service = WebService %]
[% search_ref = web_service.webservice_call('/api/shop/search', form) %]
Can’t tell you how cool I think that is. If I decide to serve straight from flex or something, then it is pretty well no code change.
And that’s about it. Give it a go.
Enjoy!
Earl
June 18th, 2009
Awhile ago I wrote some (I think) cool stuff for pig that allowed for parsing apache logs. Unfortunately I wrote my stuff on an old branch. Didn’t really know it was an old branch and that everything I wrote would need to get ported, but there you go. Recently, someone ported my stuff (which was awesome!), and folks at cloudera are blogging about it.
Years ago, I wrote this (I thought) cool stuff, Data::Fallback, which would allow you to pull data from various sources. I don’t think anyone in the world ever used it. Like ever. In fact I discovered memcached and I quit using it. Kind of cool that folks might actually use some stuff I wrote.
Earl
June 18th, 2009
One of my “many” hosting customers was mentioning that when he logs into his site, he sometimes sees an error. Well, it turns out I would sometimes see that error and his mentioning it inspired me to look into it. It comes down to speed. I have my admin stuff hosted on google app engine and it talks to my backend via web service. Turns out that google doesn’t want to host slow serving pages, meaning pages that take more than like five seconds to load. And it turns out that my web service would sometimes take more than five seconds. There were a couple issues.
- Memcached helps me not hit the database. I used to have servers at 10.1.1.1 and 10.1.1.2. A little while ago I quit running the 10.1.1.2 server, but was still checking it in the code. Think I would hit some timeout which wasn’t too long, but it slowed me down enough to annoy google.
- Memcached is all about what the memkey is. You look up values based on a memkey. Well, I call it memkey anyway. For me, if the memkey doesn’t return something, I hit the database and then add the memkey. Well, in my code for getting a user’s configuration, I had hard coded $memkey = time, which means that each time the code ran, I would fail to get the conf. I guess that someday in the past I wanted to generate the conf each time, and then just happened to commit. Oops.
- Added an index or two to mysql, but don’t think that helped too much.
I am afraid that folks would try and login, get an error and give up. For sure they wouldn’t likely tell their friend to come sign up for a site.
Enjoy!
Earl
June 13th, 2009
Let’s suppose that you have log files of some sort pouring in and you want to put aggregate data representing the logs into an RDBMS. To begin, let’s start with a blank slate, i.e., just dumping the data in. And let’s have a simple table, that in mysql is created via
CREATE TABLE `history` (
`id` int(11) NOT NULL auto_increment,
`hits` int(11) NOT NULL,
PRIMARY KEY (`id`)
);
I did a pass each for both MyISAM and Innodb with a million inserts.
| engine | values per insert | seconds (lower is better) |
| MyISAM | 10000 | 7.046952963 |
| MyISAM | 1000 | 7.342753172 |
| MyISAM | 100 | 8.521313906 |
| MyISAM | 10 | 31.44731498 |
| MyISAM | 1 | 135.3045712 |
| MyISAM | load data infile | 4.927606106 |
| Innodb | 10000 | 19.76374817 |
| Innodb | 1000 | 30.58060002 |
| Innodb | 100 | 89.54839206 |
| Innodb | 10 | 723.135994 |
| Innodb | load data infile | 17.25715899 |
A multi-value insert for three values looks like this
INSERT INTO today (hits) VALUES (?), (?), (?)
Then I execute with the three values.
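For larger batches you would not type the placeholders by hand. Here is a small Python sketch (not the benchmark script itself; multi_value_insert is a made-up name) that builds one multi-value statement per batch and pairs it with the values to bind.

```python
def multi_value_insert(table, column, values, batch_size):
    """Yield (sql, params) pairs, one per batch of at most batch_size values."""
    # table and column are interpolated directly, so they must be trusted
    # identifiers; only the values go through ? placeholders.
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        placeholders = ", ".join(["(?)"] * len(batch))
        sql = "INSERT INTO %s (%s) VALUES %s" % (table, column, placeholders)
        yield sql, batch


for sql, params in multi_value_insert("today", "hits", [5, 7, 9, 11], 3):
    print(sql, params)
```

With a batch size of 3 and four values, this produces the three-placeholder statement shown above plus a final single-value statement for the leftover row.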
The fact that inserts with 1000 values start to approach the load data infile numbers is a little compelling. But let’s suppose that we want to do every insert from a bulk load but we want to have a table (like history above) that has aggregate data, += style. Is it possible? Sure.
Here is one approach for mysql:
- Create a temp table, which I will call today
- Bulk load the data into today
- Run the query INSERT INTO history (SELECT * FROM today) ON DUPLICATE KEY UPDATE history.hits = history.hits + today.hits;
- Drop today
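Here is a runnable sketch of those four steps. To keep it self-contained it uses Python's built-in sqlite3 instead of mysql, so two things are swapped in and are assumptions of the sketch: executemany stands in for load data infile, and SQLite's ON CONFLICT ... DO UPDATE (SQLite 3.24 or newer) stands in for mysql's ON DUPLICATE KEY UPDATE. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL)")
conn.execute("INSERT INTO history (id, hits) VALUES (1, 10), (2, 20)")

# 1. Create the temp table.
conn.execute("CREATE TEMP TABLE today (id INTEGER PRIMARY KEY, hits INTEGER NOT NULL)")

# 2. Bulk load today's aggregates (executemany stands in for load data infile).
conn.executemany("INSERT INTO today (id, hits) VALUES (?, ?)", [(2, 5), (3, 7)])

# 3. Merge: new ids are inserted, existing ids get hits += today's hits.
#    "WHERE 1" is needed so SQLite can tell the upsert clause from a join.
conn.execute(
    "INSERT INTO history (id, hits) SELECT id, hits FROM today WHERE 1 "
    "ON CONFLICT(id) DO UPDATE SET hits = hits + excluded.hits"
)

# 4. Drop the temp table.
conn.execute("DROP TABLE today")

rows = conn.execute("SELECT id, hits FROM history ORDER BY id").fetchall()
print(rows)  # prints [(1, 10), (2, 25), (3, 7)]
```

Row 1 is untouched, row 2 picks up the extra 5 hits, and row 3 is new.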
I would like to apply this strategy and contribute some pig code that allows for bulk insert. This would (I think) allow for some pretty large scale aggregating all from within a “simple” pig script. Would also like to start using chukwa, but it looks a little tough. I think the architecture would then look something like
web servers -> chukwa -> pig -> mysql
Think then I would be pretty well at yahoo! or facebook scale.
Guess we’ll see how it all goes
Enjoy!
Earl
April 15th, 2009
I was pretty excited about getting sitemaps working, so much so that I recently wrote about it. Turns out I had a couple bugs in my implementation. When I was on diamondcomparer.com, I would do something like show all the categories / products that diamondcomparer didn’t actually offer. Also turns out that pretty well each shopping site had more than 50,000 urls, which means I had to break things up a bit. Plus, I wasn’t zipping things, and I wasn’t real confident I was doing everything right. So, I decided to use google’s open source code for generating sitemaps, which I figured handled everything I was looking for.
In the past I had used the google code for crawling directories, but now I needed to pull from a database to get my list of urls. Well, turns out the google code can handle that as well. You just dump the urls to a file, make a config file explaining a few things and then away you go. Was really not too bad. Stayed up till three am last night getting this to work.
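For the curious, the two things I was missing (splitting at 50,000 urls per file and zipping) look roughly like this. This is my own Python sketch, not the google script and not its config format; the function names and the sitemapN.xml.gz file naming are made up for illustration.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file limit from the sitemaps protocol


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of urls into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def sitemap_xml(urls):
    """Render one <urlset> document for a single chunk of urls."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
    for url in urls:
        # escape() turns & into &amp; so query strings stay valid xml.
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def write_sitemaps(urls, prefix="sitemap"):
    """Write one gzipped sitemap file per chunk and return the file names."""
    names = []
    for number, chunk in enumerate(chunk_urls(urls), start=1):
        name = "%s%d.xml.gz" % (prefix, number)
        with gzip.open(name, "wt", encoding="utf-8") as handle:
            handle.write(sitemap_xml(chunk))
        names.append(name)
    return names
```

Calling write_sitemaps with the list of urls dumped from the database would give sitemap1.xml.gz, sitemap2.xml.gz and so on, which a sitemap index file can then point at.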

While I am here, have you seen chrome’s xml viewer? Yeah, me neither, it just dumps to the screen.
I have been tracking google (and others) crawling my stuff, and it looks like the product pages haven’t been getting crawled. I am hoping this helps that out. Guess we shall see. I am now generating these files and pinging the search engines nightly. Really would like to get traffic based on product pages being indexed well.
Enjoy!
Earl
April 14th, 2009
A couple main parts to my shopping vision.
- Help users filter down to just the product(s) that they are looking for
- Find the best prices on said products
Well, been hoping for some progress on the second front for a good few years to no avail. For whatever reason, this last week I integrated products (not a ton but a few) from newegg, bestbuy and buy. This allows for actual price comparison, like so from http://mycomparer.com/ap/B00005ATMK/yo
pretty dang cool, I think.
The first one took a fair amount of work, but the next couple were a bit easier. Looks like overstock.com has a data feed. I would like to integrate with them and pretty well anyone that pays for conversions and offers a csv or the like.
I bit the bullet and switched everything that was going to trackings.com, shopthar.com, shop.spack.net, yohomes.com (and maybe a few others), and pointed it to http://mycomparer.com/. Guess we’ll see how it goes.
Enjoy!
Earl
March 17th, 2009
Recently I cleaned up some shopping stuff so that if I got a single slider (that’s what I call the number picker guys) that was kind of empty, I would kill it. I knew that it wasn’t really a general fix, and figured that I would someday have to clean it up. Well, tonight was the night. Stuff that looked like this

now doesn’t have that last chunk. Not real sure where it comes from, but at least now I can clean it up
Also cleaned up something a little more subtle. My goal is to have a pretty generic shopping engine, which doesn’t know the difference between a hard drive form factor and the clarity of a diamond. That’s fine, except that also means it doesn’t know that IsLabCreated may not be the most meaningful, or weird contract warranty terms, or whatever. Tonight I added the ability to ignore a category of stuff. I even made it smart so that I walk the lineage of a category and look for ignore lists. The cool part there is that I didn’t need to ignore IsLabCreated for each type of diamond ring. So now, this

has the weird stuff stripped out. Granted I need to do a database insert each time I find something else to ignore, but that’s ok. I don’t have that many top level categories to manage (for now).
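The lineage walk is the part I like, so here is a rough Python sketch of the idea. It is not the real engine: the category names, the PARENT and IGNORE dicts, and the function names are all made up for illustration, and the real ignore lists live in the database. The point is that an ignore list attached to a top level category applies to everything underneath it.

```python
# Hypothetical category tree: child -> parent (None marks a top level category).
PARENT = {
    "diamond rings": None,
    "engagement rings": "diamond rings",
    "solitaire rings": "engagement rings",
    "hard drives": None,
}

# Ignore lists are attached to whichever category they apply to.
IGNORE = {
    "diamond rings": {"IsLabCreated"},
}


def ignored_attributes(category):
    """Collect ignore lists from the category and every ancestor above it."""
    ignored = set()
    while category is not None:
        ignored |= IGNORE.get(category, set())
        category = PARENT.get(category)
    return ignored


def visible_attributes(category, attributes):
    """Drop attributes that any category in the lineage asks to ignore."""
    ignored = ignored_attributes(category)
    return [name for name in attributes if name not in ignored]


print(visible_attributes("solitaire rings", ["Color", "Clarity", "IsLabCreated"]))
# prints ['Color', 'Clarity']
```

One entry on "diamond rings" covers every type of ring below it, while "hard drives" is unaffected.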
Next up, I would like to specify the order for distinctables to appear. Like color and clarity before number of stones, or the like.
Tonight it hit me that once I get a little further along, I can go put up some “looking for a friend in the diamond comparing business?” flyers at byu, since I think a few folks down there are looking for rings
Anyone have any thoughts on mycomparer.com? I registered it today. Gotta be better than shopthar, right? Still liking my diamondcomparer.com.
Enjoy!
Earl