Wednesday, August 29, 2007

Full-text search adventures with PostgreSQL

So yeah, we've been using PostgreSQL on FunAdvice since just after we made the big switch to Ruby on Rails, more than a year ago now. We're still a small site, coming up quickly on a million uniques per month, and we're holding up pretty well.

I know tons of sites use MySQL successfully, but for us, there was some data corruption, mysterious MySQL load flare-ups and slowdowns in some key queries. The switch to PostgreSQL did it for us, and to be honest I have not had to think about the database in the past year.

Now we've tried different kinds of search engines over the past year too. At first, ferret worked well with acts_as_ferret, but the index just kept crashing and throwing some nasty errors. Then we went to swish-e, which gave us perfect results, but keeping the index rebuilt daily was becoming a massive chore (the index has to be rebuilt completely each time).

Finally we tried Tsearch2 which had stemming out of the box and truly remarkable speed over a full text corpus of over 200,000 records.

Now I could do stuff like this:

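# Raw search phrase as typed by the user
tquery = 'photos of emo'
# to_tsquery needs an operator between terms, so AND the words together
@photos = Photo.find(:all, :conditions => ['fti @@ to_tsquery(?)', tquery.strip.gsub(/\s+/, ' & ')], :order => 'id DESC', :limit => 10)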


And it works great with native pagination and fits seamlessly into Active Record.
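
One thing to watch with the query above: to_tsquery expects an operator between terms and can raise a syntax error on stray punctuation, so it helps to clean up user input first. Here's a minimal sketch of how that might look. Note that build_tsquery is a hypothetical helper name (not part of Tsearch2 or Rails), and ANDing every word together is just one assumption about the search behavior you want.

```ruby
# Hypothetical helper (not from Tsearch2 or Rails): turns free-form user
# input into a string that to_tsquery can parse.
def build_tsquery(raw)
  # Keep only runs of letters and digits, so stray operators such as
  # "&", "|", "!" or parentheses never reach to_tsquery.
  terms = raw.to_s.downcase.scan(/[[:alnum:]]+/)
  # AND the terms together; blank input yields an empty string.
  terms.join(' & ')
end

puts build_tsquery('photos of emo')
```

With that in place, the condition becomes ['fti @@ to_tsquery(?)', build_tsquery(tquery)]. If the helper returns an empty string, you probably want to skip the search altogether rather than send an empty query to the database.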

No doubt it takes a little work setting up the database fields and so forth first, and these add-on PostgreSQL types are not going to fit into migrations either. But this works for us, and so far it seems to scale nicely with what we have.

Give Tsearch2 a whirl if you're already using PostgreSQL, or even if you're not. You'll be glad you did.

Wednesday, August 22, 2007

Five days into the new design

It's been only five days since our new design, but we've learned a few things:

1) People take time to learn the layout of a site (we're sorry for such a massive change, with little advance warning)
2) 99.9% of the comments have been positive (thank you! that means a lot to us)
3) No matter how much time you allocate to doing something, it'll always take longer.

We have some really cool features that we wanted to include but didn't make it due to our scheduled launch date.

So, the design is going to stay as is for a while. BUT, we'll be adding more features over the next few months that I think will continue to improve on the "fun" in FunAdvice.

Last...did you see our press release? Check it out, it tells more of our story than we've ever shared in one place before.

Friday, August 17, 2007

Massive upgrade in the works...back in 20 min

We have our biggest upgrade in a year launching now...we'll be back in 20 minutes or less. Thanks!

Thursday, August 9, 2007

Monthly QuantCast Stats checkin: still growing the fastest

Every month now, I've been looking at these numbers:
quantcast funadvice stats. While they are light by almost 50%...they do show our trend spot on. It's like the data they license is nearly 50% of the available data that we have from Google Analytics. Interesting.

Now, compare Yedda, AnswerBag, and us on Quantcast. See who's growing the fastest? Yep, so do we :).

Yahoo Answers, the clear winner, is also flat over this time period...which, honestly, I'm not surprised at. The signal to noise ratio is extremely low, and their large community size makes the sheer scale of the thing difficult for new users to adopt, understand, and embrace.

How do we stack up versus Amazon & MSN? Again, we're winning. Not only are we bigger, but we're growing faster, too.

That's my take on the Q&A market. Are you listening, Mike Arrington? No funding, part time employees, and we're taking over this category. Still, no TechCrunch write ups.

What about you, Om Malik? We talked about six months ago, and we're nearly 3x the size we were when I did a phone interview. Still, no write up. Oh, well.

The social networking guru himself, Pete Cashmore. Obligatorily, you covered Yedda, AnswerBag, etc...however, not one mention of us. ;)

And, my homie at Startup Squad. We exchanged email, but, I'm sad, still no write up :(

Now, what do my pointing out the growth stats and the tier one blogging public have to do with FunAdvice?

Easy: To build a big, huge, and fast growing site, you can forget (to some degree) about the A-list bloggers. They aren't the audience for FunAdvice, and frankly, more than likely never will be. However, it'd be interesting if they *did* write, simply because, like it or not, by the end of the year, FunAdvice will be the 2nd largest in the category.

(shhh...bold prediction). By the end of 2008, we'll be the largest. We're already the best ;)

Thursday, August 2, 2007

Going green, how do we start?

Getting more environmentally friendly has been on my mind for a long time now. Just this morning, I read about Discovery Channel buying a pro green blog site, and that reminded me: going green isn't just about keeping the earth sustainable, it's also good business these days.

However, while my inclination is to "go green & get eco friendly" I must confess that we're not entirely sure what we should do to start.

If you have any ideas, let us know by commenting or posting some questions on the site proper. Thanks!

Wednesday, August 1, 2007

SPAM from some dork

Slightly edited...read below & note the following:
1) they are spamming blogs who mention Yedda (are they spamming blogs who mention us, too?)
2) The fuckhead didn't personalize his writing
3) We don't allow swearing on funadvice.com...but, when I'm blogging & some ASSHOLE spams me, you better believe I'm going to call him an idiot.

Not to mention, an asshole. Kevin, kindly, fuck right off ;)

My name is Kevin Carey and based on your coverage of Yedda and other Q&A sites, I thought you might be interested in covering a new website I have developed.

QueryCAT is a new search engine that has indexed FAQs from all over the web and made them searchable from a single site. Its an interesting take on the "vertical search engine" in that it is not topic specific, but instead searches a specific "format" of web page.

We have indexed over 4 million questions by crawling the web and applying our own unique question recognition technology to power the search.

We believe that a comprehensive FAQ search will be a very valuable tool for many people.

In addition to our own site, querycat.com, we plan to offer the FAQ database and question recognition technology to other companies that want to add FAQ search to their own site.

If you have any questions about the site, please let me know. I am interested to hear what you think.

Kevin Carey
Founder, QueryCAT
webmaster@querycat.com