
Installing untrusted PL/Ruby for PostgreSQL

Posted Mon, 22 Aug 2005 12:55:00 GMT

This is going to be short and sweet.

“PL/Ruby is a loadable procedural language for the Postgres database system that enables the Ruby language to create functions and trigger procedures”

Method 1: The standard, safe PL/Ruby.

Before running this, you need to have all the PostgreSQL headers installed (see INSTALL in the PostgreSQL source directory):

make install-all-headers

To install PL/Ruby, you need to download the tarball from here. As you can see, I download it with wget and then install it like I would any Ruby library. (Maybe plruby could become a gem?)

cd /usr/local/src
wget ftp://moulon.inra.fr/pub/ruby/plruby.tar.gz
tar zxvf plruby.tar.gz
cd plruby
ruby extconf.rb
make
make install

Method 2: The untrusted, but super cool PL/Ruby.

Guy Decoux, author of PL/Ruby, was kind enough to share a secret about the PL/Ruby install. (from his email…)

Well, plruby normally runs with $SAFE = 12; this value is fixed at compile time.

Now it has an undocumented option: if you compile it with

ruby extconf.rb --with-safe-level=0 ...

 it will run with $SAFE = 0 and you have the equivalent of an untrusted language.

Pretty simple solution, eh?

On my server I was able to run the following:

cd /usr/local/src
wget ftp://moulon.inra.fr/pub/ruby/plruby.tar.gz
tar zxvf plruby.tar.gz
cd plruby
sudo ruby extconf.rb \
  --with-pgsql-dir=/usr/local/pgsql-8.0 \
  --with-safe-level=0 \
  --with-suffix=u
make
make install

Update: the --with-suffix=u option was added after someone commented on this; it allows you to install both plruby and plrubyu.

Installing PL/Ruby in PostgreSQL

Up until now, you haven't actually installed the language into the database. We're close though!

All that you need to do is run the following commands to install it into a specific database on your server.

$ psql template1
template1=# CREATE DATABASE plruby;
CREATE DATABASE
template1=# \c plruby
You are now connected to database "plruby".
plruby=#    create function plruby_call_handler() returns language_handler
plruby-#    as '/usr/lib/site_ruby/1.8/i386-linux/plruby.so'
plruby-#    language 'C';
CREATE FUNCTION
plruby=#    create  language 'plruby'
plruby-#    handler plruby_call_handler
plruby-#    lancompiler 'PL/Ruby';
CREATE LANGUAGE
plruby=#

That should be all there is to it!
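
To make sure the new language actually works, you can define a trivial function in it and call it. This is just a sanity check of my own (the function name is made up); PL/Ruby hands you the function arguments in the args array:

CREATE FUNCTION ruby_max(int4, int4) RETURNS int4 AS '
    if args[0].to_i > args[1].to_i
        return args[0]
    else
        return args[1]
    end
' LANGUAGE 'plruby';

SELECT ruby_max(3, 7);  -- should return 7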

Where do we go from here?

See my post: PL/Ruby loves RubyGems and DRb

When TSearch2 Met AJAX

Posted Mon, 22 Aug 2005 01:48:00 GMT

Last night, a local PDX.rb-ist asked about full text searching in PostgreSQL. I pointed him to TSearch2, which is a nice little add-on that handles full text searching with indexing, ranking, highlighting, etc. To my knowledge, it's the closest to a Google-like search that you can get with PostgreSQL. Some people in #postgresql (irc.freenode.net) said that you can build custom functions that will allow you to quote content and do other fun stuff within your search string. We can discuss that another time.

After thinking it over, I thought, “why not put ajax on top of a full text search and see what it can do?”

The first question was where I was going to get a bunch of content that I could search through and that would be somewhat meaningful to the public if I decided to put it up as a demo page. The RubyOnRails mailing list came to mind, so after seeing that I couldn't download the full archive from the Rails mailman page (at least not that I could tell), I decided that I would just import my Maildir for that mailing list.

This added another initial step: what would be a good way to import the ~13,000 emails that I had in the folder?

I knew that, worst case, I could find a module on CPAN and build a Perl script to import it, since I didn't see anything in the standard Ruby library. Then I found TMail. Someone mentioned that they think ActionMailer uses TMail as well.

The resulting quick and dirty script became:

#!/usr/bin/env ruby

require 'rubygems'
require 'tmail'
require 'postgres'
require 'dbi'

# Connect to the database that holds the archives table.
conn = DBI.connect("DBI:Pg:database=rails_mailinglist;host=localhost;port=5403", "username", "password")

MAILBOX = ".MailingLists.Ruby.RubyOnRails"

sql = "INSERT INTO archives (sender, recipient, subject, body) VALUES (?,?,?,?)"

@sth = conn.prepare(sql)

# Walk the Maildir and insert each message into the archives table.
box = TMail::Maildir.new(MAILBOX)

box.each do |port|
  mail = TMail::Mail.new(port)
  p mail.subject
  @sth.execute(mail.from, mail.to, mail.subject, mail.body)
end

exit

Not rocket science. :-)

Okay, so I let that start running through the mailing list emails that I have, opened up another tab in iTerm, and typed our friend rails archives, followed by cd archives. The next step was to modify the config/database.yml file.

(you all know how to do that, right?)

Okay, you should still be with me…so far.

After I got my database settings in place, I ran ./script/generate scaffold Archive and watched it create my new files to play with.

After ./script/server, I am looking at the first several emails in my RubyOnRails mailing list folder. I notice that the first one is the confirmation email from the day that I signed up on the mailing list: Mon, 24 Jan 2005 16:00:14 +0000 (GMT). So, I delete that email and the 'welcome to..' one so that no one sees my mailman password/confirm info. ;-)

Installation

So, Rails had no problem with the data. I then headed over to the TSearch2 site and looked for some installation information. I walked through this walkthrough.

Database Structure

For this example, I kept it pretty simple for the database structure. I believe the create script was:

CREATE TABLE archives (
  id SERIAL PRIMARY KEY,
  sender VARCHAR(255),
  recipient VARCHAR(255),
  subject VARCHAR(255),
  body TEXT
);

The rest was basically following through with those steps and building the triggers and functions around the subject and body fields in the table.
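
For reference, the tsearch2 portion boils down to something like this (a sketch based on that walkthrough rather than my exact commands; the index name is made up):

ALTER TABLE archives ADD COLUMN idxfti tsvector;
UPDATE archives SET idxfti = to_tsvector('default', coalesce(subject,'') || ' ' || coalesce(body,''));
CREATE INDEX archives_idxfti_idx ON archives USING gist(idxfti);
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE ON archives
    FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxfti, subject, body);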

To use the tsearch2 functionality, I used find_by_sql rather than using just find.

@archives = Archive.find_by_sql("SELECT id, headline(body,q) as headline, body, rank(idxfti,q) as rank, sender, subject  FROM archives, to_tsquery('#{@str}') AS q WHERE idxfti @@ q ORDER BY rank(idxfti,q) DESC LIMIT 100") 

The @str variable is a value that I build based on the string(s) that the user is typing in the search field. Tsearch2 requires that you separate each string with a pipe (|). So, I put in a few checks on the string that was being passed to my method in my controller by AJAX. (I'll let you take the time to figure out how to get AJAX in Rails working and watching a text field… it's not hard to find info on Google.) :-)
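
A minimal sketch of that controller method (the action and parameter names here are hypothetical):

def live_search
  raw = @params['search'].to_s

  # Keep only word characters and whitespace, then join the remaining
  # terms with the pipe separator that to_tsquery expects.
  @str = raw.gsub(/[^\w\s]/, '').split.join('|')
end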

The end result?

I will warn you that this doesn't work in all browsers (some IE people said they had issues), and I spent enough time tinkering with it to just settle with this for now. :-)

I present… fulltext searching with PostgreSQL on Rails.

There are approx. 13,000 emails in the system, so I limited the number of results that show up to 100.

My Thoughts

Well, it was an interesting concept. I'm not a big fan of live searching; it doesn't really seem to buy us much when working with this sort of data. I do find live auto-completion to be quite useful, though. It's not practical to have AJAX peg the database for new content every second as I type, and it's obvious that a database with that much content is not going to respond as snappily as you would hope. However, I decided to compare the speed to searching in Thunderbird and Evolution. From my sophisticated benchmarking suite (my imaginary stop watch)...

AJAX won!

Okay, I should be fair and say that TSearch2 won, as it is doing all the heavy lifting.

Enjoy!

PostgreSQL sequences in Rails

Posted Sat, 20 Aug 2005 06:05:00 GMT

Rails doesn't support legacy or custom-named sequences at the moment (as far as I am aware). It's kind of tricky to have it detect the SEQUENCE name automatically (every time).

In PHP, I used this big ugly query to detect this info:
$sql = $db->prepare("SELECT seq.relname::text
                        FROM pg_class src, pg_class seq, pg_namespace, pg_attribute,
                        pg_depend
                        WHERE
                            pg_depend.refobjsubid = pg_attribute.attnum AND
                            pg_depend.refobjid = src.oid AND
                            seq.oid = pg_depend.objid AND
                            src.relnamespace = pg_namespace.oid AND
                            pg_attribute.attrelid = src.oid AND
                            pg_namespace.nspname = ? AND
                            src.relname = ? AND
                            pg_attribute.attname = ?");

I used this to mimic the mysql_insert_id function in PHP for PostgreSQL… ( pg_insert_id )

Well, with Rails, I thought that I would build a similar patch, as the current code just assumes the value would be {column}_id_seq.

After hours of playing around and thinking that I figured it all out ... I decided to run a quick test with a non standard sequence name… like this one:

testingdb=# \d legacy.foobar
                                     Table "legacy.foobar" 
  Column   |         Type          |                         Modifiers                          
-----------+-----------------------+------------------------------------------------------------
 foobar_id | integer               | not null default nextval('legacy.old_sequence_name'::text)
 name      | character varying(40) | 
Indexes:
    "foobar_pkey" PRIMARY KEY, btree (foobar_id)

testingdb=# INSERT INTO legacy.foobar (name) VALUES ('abc')
testingdb-# ;
INSERT 17514 1
testingdb=# SELECT * FROM legacy.foobar ;
 foobar_id | name 
-----------+------
       106 | abc
(1 row)

My patch wouldn't figure that out because the sequence was not created by SERIAL. So, my patch started to feel lame and like a total waste of time: I thought it was fixing a problem, but it worked only about as well as simply assuming the default _seq naming, except that it needed an extra SQL query to determine it. We all (should) know that the sequence will be named like that when working with SERIAL. So, my patch didn't buy us anything.

However, Active Record still doesn’t support those funky sequence names. So, I found this ticket #1273.

Their approach was very similar to the one that caused me to write my long SQL query in the first place; it was suggested to me well over a year ago, and I found that it does not work in the following situation.

If I have two separate schemas with the same table name in each, like so:

=# \d legacy.people
                                       Table "legacy.people" 
  Column   |         Type          |                           Modifiers                           
-----------+-----------------------+---------------------------------------------------------------
 people_id | integer               | not null default nextval('legacy.people_people_id_seq'::text)
 name      | character varying(50) | 
Indexes:
    "people_pkey" PRIMARY KEY, btree (people_id)

=# \d foo.people
                                       Table "foo.people" 
  Column   |         Type          |                         Modifiers                          
-----------+-----------------------+------------------------------------------------------------
 people_id | integer               | not null default nextval('foo.people_people_id_seq'::text)
 name      | character varying(50) | 
Indexes:
    "people_pkey" PRIMARY KEY, btree (people_id)      

That patch will not work because you can’t call the following query:

# SELECT adsrc FROM pg_attrdef WHERE adrelid = (SELECT oid FROM pg_class WHERE relname = 'people');
ERROR:  more than one row returned by a subquery used as an expression  

... because there are two tables with the same name! (fun, huh?)

Mine would work… but why bother with that huge query? So, I took my ticket out of [PATCH] status and decided it would be best to just assume that sequences are generated with SERIAL ( link ) by default in AR.

Okay, so what can we do about custom SEQUENCE names?

Well, I am proposing the following (and mentioned this in the ticket #2016)...

class LegacyTable < ActiveRecord::Base
  def self.table_name() "legacy.foobar" end

  # new option for this
  set_primary_key "foobar_id", :sequence => "legacy.old_sequence_name" 
end

(or something along those lines)

With this, I can work around these legacy database scenarios with a quick option. Thoughts/opinions?
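
In the meantime, one possible workaround (an untested sketch, reusing the legacy.old_sequence_name example from above) is to fetch the next value from the custom sequence yourself in before_create, so Active Record never has to guess the sequence name:

class LegacyTable < ActiveRecord::Base
  def self.table_name() "legacy.foobar" end
  set_primary_key "foobar_id"

  # Grab the id from the legacy sequence before the INSERT runs.
  def before_create
    if foobar_id.nil?
      self.foobar_id = connection.select_one("SELECT nextval('legacy.old_sequence_name') AS next_id")['next_id'].to_i
    end
  end
end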

I decided to post this on my blog as well, because I do know that there are a few skeptical PostgreSQL people out there who read my blog… I want you to know that I am looking out for you. ;-)

I am sick and tired.. and going to sleep now.

When getting sick sucks the most

Posted Fri, 19 Aug 2005 13:21:00 GMT

As I mentioned earlier this week, I've been sick. It's totally screwed up the past few weeks of productivity, trying to fight off a cold while meeting some deadlines. :-(

The worst part of the whole thing: I realized the other day that I had to cancel my road trip to FooCamp. Yes, I can't make it. I made the decision on Wednesday, as I would have had to start driving yesterday. I could have gone… but when I saw that my girlfriend got sick from being near me, I didn't want to expose all the geek superstars to whatever I caught while I was in Southern California almost two weeks ago. I'm still fighting it off… and get to spend my weekend not hanging out with geeks. Le sigh.

I don’t want to be known as “that guy who got everyone sick!” Heh.

Have fun down there, Lucas!

Are you a console master?

Posted Fri, 19 Aug 2005 02:06:00 GMT


I have a few questions.

1.) Do you know what ./script/console does?

2.) If not, why not?

3.) If so, do you have any fun tips and tricks to share with the masses?

It occurred to me earlier that many people who might have come from the PHP camp may have never really tested their object-oriented code from some sort of interactive program (irb). If you are coming from the Python, Java, etc. worlds, interactive testing isn't anything new. Rails is nice enough to bundle a console script right within it!

I meet people online who have never even tried to run it. There are not many tutorials on the wiki that show console… and in my opinion, it's one of the coolest things about Ruby and Rails. (But, I come from the PHP world…)

So, if you aren't using it… why not? Got a moment? Try this from the root path of your Rails application.

./script/console

Did it start up okay? If so, what is the name of one of your models? Let's say that I have a model structure like:

class Customer < ActiveRecord::Base
  has_many :orders, :dependent => true
end

class Order < ActiveRecord::Base
  belongs_to :customer
end

From console, you can access your models and do all sorts of fun things.

>> y = Customer.find(16)
=> #<Customer:0x2743ea4 @attributes={"name"=>"Robby", "id"=>"16"}>
>> y.orders
=> [#<Order:0x27416b8 @attributes={"id"=>"18", "amount"=>"12.00", "customer_id"=>"16"}>, #<Order:0x274167c @attributes={"id"=>"19", "amount"=>"12.50", "customer_id"=>"16"}>] 

Pretty neat, huh?

>> o = Order.find(18)
=> #<Order:0x273da68 @attributes={"id"=>"18", "amount"=>"12.00", "customer_id"=>"16"}>
>> o.customer.name
=> "Robby" 

If you are remotely a console wizard, please share some tips and tricks for those who are not sure what to do with it. I personally find myself in console all the time when I am working with Rails, testing stuff out with my models before I move any of the code into my application.
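
For instance, a typical throwaway session with the models above might look like this (ids and output abbreviated; nothing here ends up in the application code):

>> c = Customer.create(:name => "Nigel")
=> #<Customer:0x2743e90 @attributes={"name"=>"Nigel", "id"=>17}, ...>
>> c.orders.size
=> 0
>> c.destroy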

It sure beats hitting refresh in your browser all day. :-)

Active Record, I <3 U but I still trust my database server (a tiny bit more)

Posted Fri, 19 Aug 2005 01:20:00 GMT

While working on a portion of my book, I found myself in ./script/console and was seeing some weird issues when I would use has_many and belongs_to.

Let’s take two simple models.

class Order < ActiveRecord::Base
  belongs_to :customer
end

class Customer < ActiveRecord::Base
  has_many :orders
end

After a few test records...

test_dev=# SELECT * FROM customers;  
 id |      name      
----+----------------
  1 | Robby
  2 | Nigel
  3 | Linus
(3 rows)

test_dev=# SELECT * FROM orders;
 id | customer_id | amount 
----+-------------+--------
  1 |           1 |  12.00
  2 |           3 |  12.00
(2 rows)

Nothing completely crazy going on, right?

Loading development environment.
>> Customer.destroy(3)
=> {"name"=>"Linus", "id"=>"3"}
>>     

=# SELECT * FROM orders;
 id | customer_id | amount 
----+-------------+--------
  1 |           1 |  12.00
  3 |           3 |  12.00
(2 rows)

Wait a minute! I just deleted a customer with an id of 3!

So, what is wrong with this scenario? Can you think of any potential problems that could occur from data like this? The record has a customer_id for a customer that does not exist. This is why we have relational databases in the first place, right? :-)

Here is something that I learned today that I was unaware of: Active Record allows you to pass the has_many declaration the option :dependent.

class Customer < ActiveRecord::Base
  has_many :orders, :dependent => true
end

What is this option? Well, according to the AR documentation, “:dependent – if set to true all the associated objects are destroyed alongside this object. May not be set if :exclusively_dependent is also set.”

In a nutshell, this works like ON DELETE CASCADE does in PostgreSQL. So, it will go through and delete the orders associated with the customer that I was attempting to destroy.
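
In other words, with the model above (and a made-up customer id):

customer = Customer.find(3)
customer.destroy  # also destroys all of that customer's orders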

Up until today, I hadn’t broken myself out of the habit of using the built-in constraints/triggers of PostgreSQL. So, as soon as I did, this issue came up and I learned about :dependent.

test_dev# \d orders
                                Table "public.orders" 
   Column    |     Type      |                       Modifiers                        
-------------+---------------+--------------------------------------------------------
 id          | integer       | not null default nextval('public.orders_id_seq'::text)
 customer_id | integer       | 
 amount      | numeric(10,2) | 
Indexes:
    "orders_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
    "orders_customer_id_fkey" FOREIGN KEY (customer_id) REFERENCES customers(id)

test_dev=# ALTER TABLE orders DROP CONSTRAINT orders_customer_id_fkey;
ALTER TABLE 

RobbyOnRails:~/Programming/footest robbyrussell$ ./script/console 
Loading development environment.
>> cust = Customer.create(:name => 'Jim')
=> #<Customer:0x275373c @new_record_before_save=true, @new_record=false, @attributes={"name"=>"Jim", "id"=>5}, @errors=#<ActiveRecord::Errors:0x274fa88 @base=#<Customer:0x275373c ...>, @errors={}>>
>> cust.orders.create(:amount => '25.00')
=> #<Order:0x274991c @new_record=false, @attributes={"id"=>4, "amount"=>"25.00", "customer_id"=>5}, @errors=#<ActiveRecord::Errors:0x2746dfc @base=#<Order:0x274991c ...>, @errors={}>>
>>                

test_dev=# SELECT * FROM orders;
 id | customer_id | amount 
----+-------------+--------
  1 |           1 |  12.00
  3 |           4 |  29.00
  4 |           5 |  25.00
(3 rows)

As you can see, I put myself into the hands of Active Record when I ran the DROP CONSTRAINT. Then I tried running the code at the top of the post… and it didn’t work.

According to the docs, if you use :dependent => true, it should delete the foreign table records. If not, it should set the value of the foreign key field to NULL in the foreign table.

Basically, it should perform these two SQL queries:

UPDATE orders SET customer_id = NULL WHERE customer_id = 17;

DELETE FROM customers  WHERE id = 17;

Then, the order records are still in the database, but the customer is deleted. There are arguments for and against doing this sort of thing… but having the option is always nice. In any event, Active Record would not run the first query; it was just deleting from the customers table. Without my constraint, no error would be returned from PostgreSQL, and I started to get some bad data.

Imagine showing a list of orders and trying to display the customer name associated with an order that has no linked customer. Doh! If Active Record sets the customer_id to NULL, we can at least have some logic to work with this without having to run some fun SQL queries to figure out which orders do and don't have customers. (We want our applications to have clean data!)
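
For example, if the customer_id does get set to NULL, the belongs_to association simply returns nil and a view can cope with it (a hypothetical snippet):

# Somewhere in a view or helper.
if order.customer
  order.customer.name
else
  "(no customer on file)"
end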

Anyhow… Was this a bug? Should Active Record know to update the records to NULL in this case? I figured that it should be handling this task, especially since it was handling cascading deletes when you passed :dependent => true.

However, I didn’t want to prematurely post a bug report, so I began asking around on #rubyonrails (irc.freenode.net). People made a bunch of suggestions as to how to work around it. I could add a before_destroy method in my model, track the bug down, (re)add an ON DELETE trigger to my table (hah), etc. So, I decided that I would see if I could track down what happens when has_many is used for a model upon #destroy.
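
The before_destroy idea would look roughly like this (just a sketch of that workaround, not what ended up in the patch):

class Customer < ActiveRecord::Base
  has_many :orders

  # Detach the orders before this customer goes away,
  # mimicking an ON DELETE SET NULL constraint.
  def before_destroy
    orders.each { |order| order.update_attribute(:customer_id, nil) }
  end
end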

After a while of digging and making some tests, I posted a patch and a bug report. (please disregard my first patch… it did not work! heh)

Now that I figured this out, I am going to happily add my constraints back to my tables and go back to playing around. This reminded me of a post I wrote a few months ago when I mentioned that I thought it was best to put some constraints and logic in the database. I also agree that constraints should be put in the abstraction layer, but we cannot always put all our faith in our code either. A few levels of checks don't hurt. :-)

This was a fun little riddle that I took on today. The moral of the story? If you have the ability to use the built-in referential integrity features of PostgreSQL and other databases, it might be a good idea to do so. Things get overlooked, people log in to the database in many ways, and from different programs.

UPDATE: DHH responded to this post and provided a link which discusses Application Database versus Integration Database.

It should be noted that there is an important distinction between the two methods. When I said, "Things get overlooked, people log in to the database in many ways, and from different programs," I was basically describing Integration Database. However, I was also thinking of the possibility of someone opening up their MySQL or PostgreSQL GUI and manually removing a record in plain SQL. According to Application Database, the moment that you do that, you basically break this model and cannot expect your application to be fully responsible for the problems that may or may not occur. At this point, you would need to look at your application in terms of Integration Database. Please do correct me if I am wrong on this. :-)

However, with this scenario, my first attempt to move to relying on AR had a minor hiccup, but it was an easy enough fix.

By performing the following command, I am moving towards an Integration Database pattern, and that should be recognized when taking this approach into consideration.

ALTER TABLE orders ADD CONSTRAINT orders_customers_id_fkey 
    FOREIGN KEY (customer_id) REFERENCES customers (id) MATCH FULL; 

Okay, back to work!

Once again. Use constraints! (if you can)

... and thanks to DHH for providing the link and motivating me to make a note of this in my entry.
