Pepper20 asks: "Okay, now that the big boys in the database game have ported their
software to Linux, what is a smart choice for a web site backend or small business?
With all the choices out there (Sybase, Informix, Oracle, mSQL, mySql, etc... ),
what would you all recommend, and why?"
Database... :
well... here at work, we are currently on a path to support ~60,000 clients
concurrently on browser based applications. The server, linux, the database MySQL.
It does NOT have nice things like transaction/rollbacks and sub-selects, but I have
not found ANYTHING that I couldn't code around. For MASSIVE raw speed, data mining,
data warehousing, you can't beat MySQL.
MySQL here too :
I've been doing all my WebDBase programming in PHP and MySQL. It is missing some
fairly standard SQL database features (views among them), but it's extremely fast
and relatively light on system resources (try running Oracle sometime and you'll
see what I mean). I am not doing very sophisticated queries so more involved
database programming may get stuck on what MySQL lacks, but it is overall an
exceptional engine and it's free for non-commercial use. With Apache and PHP3 in
toe, it's been exceptionally easy to manipulate databases online. And, after you've
been doing queries in MS SQL Server 6.5 long enough, the speed difference is a
revelation.
Personally, I lile PostGreSQL :
But I won't fault MySQL, either. Since both are gratis for testing, I would suggest
testing both.
Sybase definitely :
If you're looking at commercial grade database solutions, your best bet is sybase.
It's full featured, and the single process, simulated thread architecture is a lot
faster than oracle or informix on low end servers.
Solid here :
We're using Solid. It's a fine dbms with all the
basic features (transactions, backups) , easy to install, easy to maintain, not too
expensive, all platforms you need (you can migrate by just copying the data files)
and it's not that expensive (Ora, IBM want big $$ for every user)
Support is fast and friendly.
PostgreSQL vs. MySQL vs. Oracle vs...
As part of my final year degree project, I have to answer this very question.
This is the unification of Cambridge University Engineering Dept's databases. At
present these are held on flat text files, Windows based databases, and an X.500
system. We'd like them in a Unix database, and use ODBC to allow windows clients to
connect. We're talking about a department in the the region of 2500 people.
I've tested a number of databases:
* PostgreSQL 6.4
* MySQL
* mSQL 2
* Oracle 8.0.5
* Ingres 2
* Informix
My final results are far too verbose to show here, but here's a quick summary:
* Commercial databases: For our purposes, these are totally unnecessary. Sure,
all the management tools that come with Oracle are nice, but if you can write
Perl, you can achieve the same, albeit with some more work.
* mSQL: MySQL is derived from this, and it offers no advantages over MySQL.
* MySQL: There's one reason and one reason only to choose this: Performance,
especially reading performance. However, the cost of this is very high - none
of the features that make databases usable with complex data structures, etc.
In particular, the lack of transactions verges on criminal negligence.
* PostgreSQL: This is the one we're going for. It supports everything you'd
expect from a modern database. In particular, inherited tables allow you to do
many things that the commercial databases have, such as per column access
control. Performance is adequate, but not fantastic. In our application, this
isn't terribly important - but I wouldn't want to run Slashdot on PostgreSQL.
In the end it all boils down to horses for courses. I suggest you install
PostgreSQL and MySQL, and write some C/Perl/Whatever scripts to benchmark their
performances with whatever data structures you're going to use. If PostgreSQL gives
adequate performance, use it, as you will sure as fate find it's features useful in
future. If not, MySQL.
Of course, you should take what I've just written with a pinch of salt if you have
unusual data requirements, which may dictate one of the commercial databases.
It ain't Oracle :
I'm an Oracle DBA by profession, and I'd have to say that almost nobody really
needs Oracle on *any* platform. It's EXTREMELY fast when properly tuned and has
just about all the functionality that anybody really needs. However, it's very
difficult to tune right since it has literally hundreds of adjustable parameters,
some of which are undocumented. It's not a database that you can set up and forget
about...you have to fiddle with it pretty frequently, or right dozens of scripts to
do the fiddling for you.
99% of the DB programs out there either use auto-commit or a simulation thereof.
Because of that, transactional stuff is wasted on most people.
I think MySQL lacks other important functionality, so I favor PostgreSQL. However,
etiher one is a pretty good database for web use (most of the time).
Depends on your needs :
I've looked at a couple databases for web based genetics applications at UT-Houston
- your choice depends quite a bit on what your needs are.
For raw speed, nothing beats MySQL on the benchmarks. This comes at a price,
though, as any functionality that might negatively impact best-case speed (e.g.
triggers, transactions) as well as other useful capabilities (sub-selects, views)
are missing. Administration is easy, and DBD, JDBC, etc. drivers are quite solid.
On the other hand, if "number of cool capabilities" is what you're looking for,
Oracle is quite good. We're using an Oracle backend as the master database
(probably for other tasks as well), with a MySQL database on the webserver itself.
Oracle seems to have an enormous learning curve, is a PITA for a few admin tasks,
and would almost certainly be overkill for small databases.
PostgreSQL has some nice GUI admin apps, is totally free, but is neither as fast as
MySQL nor as full-featured as Oracle.
Most of the commercial DBs with Linux ports came out shortly after we had the
MySQL/Oracle setup running - DB2 looks promising.
If you can code around it's shortcomings, use MySQL (but read the license - you
can't distribute it even with your app or use the Windows port for free)... and if
you run into a stumbling block in the future, you shouldn't have difficulty
upgrading to another DB.
Make sure you stick with a database-independent API (and as portable SQL as
feasable) so you're not locked into one vendor - we're using Perl/DBI (probably
mod_perl with persistent connections later) but JDBC or ODBC both have drivers for
everything as well
Phillip Greenspun's useful database book & site :
Phillip Greenspun wrote what I consider to be the most sensible book on
database-backed web sites available .. the book is called, uh, "Database Backed Web
Sites". It's out of print, but he's coming out with a new edition called "Phillip
and Alex's Guide to Web Publishing". (Alex is his dog. You'll find pictures of him
all over Phillip's site. In addition to writing about the web, databases, and
collaborative technologies, Phillip is a pretty good photographer.)
That book isn't out yet, but the good news is that the *entire text* of that book
is already on-line at http://www.photo.net/wtr .
That site also has an excellent user forum on web-backed databases.
Phillip favors Oracle, but he also pushes Solid. He's down on mysql for the same
reason others have mentioned - no transactions. If you want to know why, go check
out his book.
NOT ORACLE unless you're Citibank... :
Let's put some bullshit to rest here, shall we?
1) MySQL can simulate the important part of transactions, namely, the atomicity of
an update. Requesting a table lock for the duration of a multi-statement update,
and releasing it at the end, is what you have to do. Who the hell uses rollback,
anyways?
2) PostGreSQL is a pleasant database to use, but it's slow, and I can't for the
life of me wrap my mind around the code. There are lots of spatial and
object-oriented constructs in there which you don't need and which, in conjuction
with totally- portable transactions, make it relatively slow.
3) Oracle is a bear to install, tune, and make behave. However, with enough memory
on the server and enough competence in the DBA, it will scale from here to
eternity. I have to administer an Oracle database as the backend of my company's
corporate-infrastructure web application and it's not a pleasant job. Maybe if I
could get mod_jserv to work on my server, I'd be singing a different tune -- the
JDBC driver for Oracle is pretty nice. Unfortunately, it uses the TCP/IP listener,
which does not scale as well as using the OCI interface (via Perl/DBI/Apache::DBI)
does.
With the tweaking I do to my scripts and server, I've managed to get dynamic pages
back to the user as fast as static pages (with MySQL; slightly slower for Oracle).
For the amount of grief it has caused me, and the crappy tool SQL*plus is (I have
lots of DBI scripts to dump tables in a readable manner), I disdain Oracle.
Sybase was much more pleasant to work with, however the driver support for Sybase
on Linux is pathetic. Every molecular biology concern I know of seems to use Sybase
on Solaris, where their JDBC driver is fully supported, and the replication
features of Sybase allow the DBAs to sleep quite soundly. On Linux, though, it
sucks.
Do yourself a favor and buy a MySQL license, or help Monty hack subselects and
atomic operations (pseudo-transactions, really) into the MySQL code. The world will
be a better place when there isn't a reason to use Oracle anymore.
And if you use JDBC + servlets, use connection pooling and caching -- mail me if
you care. Servlets and Java Server Pages can obsolete ASP altogether if we work
with Sun and they work with us... otherwise, delenda est Sun Microsystems!
Article on choosing a web database :
I've written an article on choosing a web database (and a server platform) over at
ahref.com ( http://www.ahref.com/guides/technology/199808/0824piou.htm ).
Among other things, it mentions something I don't think anyone's hit on yet: as
nice and fast as MySQL is, it doesn't support stored procedures, which are a very
nice, language-independent way of storing a series of actions you want to take with
your database. With stored procedures, recoding the programs accessing your db gets
a lot easier. That being said, I do use MySQL for most sites I work on. This may
change soon, though...