a few fixes per emails

clean up PostgreSQL section
This commit is contained in:
jcole@tetra.spaceapes.com 2001-08-22 01:57:48 -05:00
parent aac4bdef3b
commit 9330051234

View file

@ -5002,25 +5002,24 @@ users.
@cindex PostgreSQL vs. MySQL, overview
When reading the following, please note that both products are
continually evolving. We at MySQL AB and the PostgreSQL
developers are both working on making our respective database as good as
possible, so we are both a serious choice to any commercial database.
When reading the following, please note that both products are continually
evolving. We at MySQL AB and the PostgreSQL developers are both working
on making our respective database as good as possible, so we are both a
serious choice to any commercial database.
The following comparison is made by us at MySQL AB. We have tried to be
as accurate and fair as possible, but because we don't have a full
knowledge of all PostgreSQL features while we know MySQL througly, we
may have got some things wrong. We will however correct these when they
may have got some things wrong. We will however correct these when they
come to our attention.
We would first like to note that @code{PostgreSQL} and MySQL
are both widely used products, but with different design goals, even if
we are both striving to be ANSI SQL compatible. This means that for
some applications MySQL is more suitable and for others
@code{PostgreSQL} is more suitable. When choosing which database to
use, you should first check if the database's feature set satisfies your
application. If you need speed, MySQL is probably your best
choice. If you need some of the extra features that only @code{PostgreSQL}
We would first like to note that PostgreSQL and MySQL are both widely used
products, but with different design goals, even if we are both striving to
be ANSI SQL compatible. This means that for some applications MySQL is
more suited, while for others PostgreSQL is more suited. When choosing
which database to use, you should first check if the database's feature set
satisfies your application. If you need raw speed, MySQL is probably your
best choice. If you need some of the extra features that only PostgreSQL
can offer, you should use @code{PostgreSQL}.
@cindex PostgreSQL/MySQL, strategies
@ -5038,15 +5037,15 @@ When adding things to MySQL we take pride to do an optimal, definite
solution. The code should be so good that we shouldn't have any need to
change it in the foreseeable future. We also do not like to sacrifice
speed for features but instead will do our utmost to find a solution
that will give maximal throughput. This means that development will take
that will give maximal throughput. This means that development will take
a little longer, but the end result will be well worth this. This kind
of development is only possible because all server code are checked by
one of a few (currently two) persons before it's included in the
MySQL server.
We at MySQL AB believe in frequent releases to be able to push out new
features quickly to our users. Because of this we do a new small release
about every 3 weeks, which a major branch every year. All releases are
features quickly to our users. Because of this we do a new small release
about every three weeks, and a major branch every year. All releases are
throughly tested with our testing tools on a lot of different platforms.
PostgreSQL is based on a kernel with lots of contributors. In this setup
@ -5057,20 +5056,19 @@ later if there arises a need for this.
Another big difference between MySQL and PostgreSQL is that
nearly all of the code in the MySQL server are coded by developers that
are employed by MySQL AB and are still working on the server code. The
exceptions are the transaction engines and the regexp library.
exceptions are the transaction engines, and the regexp library.
This is in sharp contrast to the PostgreSQL code where the majority of
the code is coded by a big group of people with different backgrounds.
It was only recently that the PostgreSQL developers announced that they
It was only recently that the PostgreSQL developers announced that their
current developer group had finally had time to take a look at all
the code in the current PostgreSQL release.
Both of the above development methods has it's own merits and drawbacks.
We here at MySQL AB think of course that our model is better
because our model gives better code consistence, more optimal and
reusable code and, in our opinion, fewer bugs. Because we are the
authors of the MySQL server code we are better able to
coordinate new features and releases.
We here at MySQL AB think of course that our model is better because our
model gives better code consistency, more optimal and reusable code, and
in our opinion, fewer bugs. Because we are the authors of the MySQL server
code, we are better able to coordinate new features and releases.
@node MySQL-PostgreSQL features, MySQL-PostgreSQL benchmarks, MySQL-PostgreSQL goals, Compare PostgreSQL
@ -5082,7 +5080,7 @@ On the @uref{http://www.mysql.com/information/crash-me.php, crash-me}
page you can find a list of those database constructs and limits that
one can detect automatically with a program. Note however that a lot of
the numerical limits may be changed with startup options for respective
database. The above web page is however extremely useful when you want to
database. The above web page is however extremely useful when you want to
ensure that your applications works with many different databases or
when you want to convert your application from one datbase to another.
@ -5092,203 +5090,230 @@ MySQL offers the following advantages over PostgreSQL:
@item
@code{MySQL} is generally much faster than PostgreSQL.
@xref{MySQL-PostgreSQL benchmarks}.
@item
Because MySQL has a much larger user base than PostgreSQL the
MySQL has a much larger user base than PostgreSQL, therefor the
code is more tested and has historically been more stable than
PostgreSQL. MySQL is the much more used in production
PostgreSQL. MySQL is the much more used in production
environments than PostgreSQL, mostly thanks to that MySQL AB,
former TCX DataKonsult AB, has provided top quality commercial support
formerly TCX DataKonsult AB, has provided top quality commercial support
for MySQL from the day it was released, whereas until recently
PostgreSQL was unsupported.
@item
MySQL works on more platforms than PostgreSQL. @xref{Which OS}.
@item
MySQL works better on Windows; MySQL is running as a
native windows application (a service on NT/Win2000/WinXP), while
PostgreSQL is run under the cygwin emulation. We have heard that
PostgreSQL is not yet that stable on windows but we haven't been able to
MySQL works better on Windows than PostgreSQL does. MySQL runs as a
native Windows application (a service on NT/Win2000/WinXP), while
PostgreSQL is run under the cygwin emulation. We have heard that
PostgreSQL is not yet that stable on Windows but we haven't been able to
verify this ourselves.
@item
MySQL has more API to other languages and is supported by more
programs than PostgreSQL. @xref{Contrib}.
MySQL has more APIs to other languages and is supported by more
existing programs than PostgreSQL. @xref{Contrib}.
@item
MySQL works on 24/7 heavy duty systems. In most circumstances
MySQL works on 24/7 heavy duty systems. In most circumstances
you never have to run any cleanups on @code{MySQL}. PostgreSQL doesn't
yet support 24/7 systems because you have have to run @code{vacuum()}
yet support 24/7 systems because you have to run @code{vacuum()}
once in a while to reclaim space from @code{UPDATE} and @code{DELETE}
commands and to perform statistics analyzes that are critical to get
good performance with PostgreSQL. Vacuum is also needed after adding
a lot of new rows to a table. On a busy system with lots of changes
good performance with PostgreSQL. Vacuum is also needed after adding
a lot of new rows to a table. On a busy system with lots of changes,
vacuum must be run very frequently, in the worst cases even many times a
day. During the @code{vacuum()} run, which may take hours if the
database is big, the database is from a production standpoint
database is big, the database is from a production standpoint,
practically dead. The PostgreSQL team has fixing this on their TODO,
but we assume that this is not an easy thing to fix permanently.
@item
A working, tested replication feature used by sites like
@uref{http://finance.yahoo.com, Yahoo finance},
@uref{http://www.mobile.de/,mobile.de} and
@uref{http://www.slashdot.org,Slashdot}.
@itemize @minus
@item Yahoo Finance (@uref{http://finance.yahoo.com})
@item Mobile.de (@uref{http://www.mobile.de/})
@item Slashdot (@uref{http://www.slashdot.org})
@end itemize
@item
Included in the MySQL distribution is included two different
testing suits (@file{mysql-test-run} and
@uref{http://www.mysql.com/information/crash-me.php,crash-me}) and a
benchmark suite. The test system is actively updated with code to test
each new feature and almost all repeatable bugs that comes to our
attention. We test MySQL with these on a lot of platforms
before every release. These tests are more sofisticated than anything
have seen from PostgreSQL and ensures that the MySQL code keeps
at a high standard.
Included in the MySQL distribution are two different testing suites,
@file{mysql-test-run} and
@uref{http://www.mysql.com/information/crash-me.php,crash-me}, as well
as a benchmark suite. The test system is actively updated with code to
test each new feature and almost all repeatable bugs that have come to
our attention. We test MySQL with these on a lot of platforms before
every release. These tests are more sophisticated than anything we have
seen from PostgreSQL, and they ensures that the MySQL is kept to a high
standard.
@item
There are far moore books in print on MySQL than on PostgreSQL.
There are far more books in print about MySQL than about PostgreSQL.
O'Reilly, Sams, Que, and New Riders are all major publishers with books
about MySQL. All MySQL features is also documented in the
MySQL on-line manual because when a feature is implemented, the
MySQL developers are required to document it before it's
included in the source.
about MySQL. All MySQL features are also documented in the MySQL on-line
manual, because when a new feature is implemented, the MySQL developers
are required to document it before it's included in the source.
@item
MySQL has supports more of the standard ODBC functions than
@code{PostgreSQL}.
MySQL supports more of the standard ODBC functions than @code{PostgreSQL}.
@item
MySQL has a much more sophisticated @code{ALTER TABLE}.
@item
MySQL has support for tables without transactions for
applications that need all speed they can get. The tables may be memory
based,@code{HEAP} tables or disk based @code{MyISAM}. @xref{Table types}.
MySQL has support for tables without transactions for applications that
need all speed they can get. The tables may be memory based, @code{HEAP}
tables or disk based @code{MyISAM}. @xref{Table types}.
@item
MySQL has support for 3 different table handles that support
transactions (@code{BDB} and @code{InnoDB}). Because
every transaction engine performs differently under different
conditions, this gives the application writer more options to find an
optimal solution for his/her setup. @xref{Table types}.
MySQL has support for two different table handlers that support
transactions, @code{BerkeleyDB} and @code{InnoDB}. Because every
transaction engine performs differently under different conditions, this
gives the application writer more options to find an optimal solution for
his or her setup. @xref{Table types}.
@item
@code{MERGE} tables gives you a unique way to instantly make a view over
a set of identical tables and use these as one. This is perfectly for
a set of identical tables and use these as one. This is perfect for
systems where you have log files that you order for example by month.
@xref{MERGE}.
@item
The option to compress read-only tables, but still have direct access to
the rows in the table, gives you better performance by minimizing disk
reads. This is very useful when you are archiving
things.@xref{myisampack}.
reads. This is very useful when you are archiving things.
@xref{myisampack}.
@item
MySQL has internal support for text search. @xref{Fulltext Search}.
MySQL has internal support for fulltext search. @xref{Fulltext Search}.
@item
You can access many databases from the same connection (depending of course
on your privileges).
@item
MySQL is coded from the start with multi-threading while
PostgreSQL uses processes. Because context switching and access to
common storage areas is much faster between threads, than are separate
processes, this gives MySQL a big speed advantage in multi-user
applications and also makes it easier for MySQL to take full
advantage of symmetric multiprocessor systems (SMP).
MySQL is coded from the start to be multi-threaded while PostgreSQL uses
processes. Context switching and access to common storage areas is much
faster between threads than between separate processes, this gives MySQL
a big speed advantage in multi-user applications and also makes it easier
for MySQL to take full advantage of symmetric multiprocessor (SMP) systems.
@item
MySQL has a much more sophisticated privilege system than
PostgreSQL. While PostgreSQL only supports @code{INSERT},
@code{SELECT}, @code{update/delete} grants per user on a database or a
table MySQL allows you to define a full set of different
privileges on database, table and columns level. MySQL also allows
you to specify the privilege on host+user combinations. @xref{GRANT}.
MySQL has a much more sophisticated privilege system than PostgreSQL.
While PostgreSQL only supports @code{INSERT}, @code{SELECT}, and
@code{UPDATE/DELETE} grants per user on a database or a table, MySQL allows
you to define a full set of different privileges on database, table and
column level. MySQL also allows you to specify the privilege on host and
user combinations. @xref{GRANT}.
@item
MySQL supports a compressed server/client protocol which
improves performance over slow links.
MySQL supports a compressed client/server protocol which improves
performance over slow links.
@item
MySQL employs the table handler concept and is the only
relational database we know of built around this concept. This allows
different low level table types to be swapped into the SQL engine, each
table type optimized for a different performance characteristics.
MySQL employs a ``table handler'' concept, and is the only relational
database we know of built around this concept. This allows different
low-level table types to be swapped into the SQL engine, and each table
type can be optimized for different performance characteristics.
@item
All @code{MySQL} table types (except @strong{InnoDB}) are implemented as
files (ie: one table per file), which makes it really easy to backup,
move, delete and even symlink databases and tables when the server is
down.
All MySQL table types (except @strong{InnoDB}) are implemented as files
(one table per file), which makes it really easy to backup, move, delete
and even symlink databases and tables, even when the server is down.
@item
Tools to repair and optimize @strong{MyISAM} tables (the most common
MySQL table type). A repair tool is only needed when a
physical corruption of a data file happens, usually from a hardware
failure. It allows a majority of the data to be recovered.
MySQL table type). A repair tool is only needed when a physical corruption
of a data file happens, usually from a hardware failure. It allows a
majority of the data to be recovered.
@item
Upgrading MySQL is painless. When you are upgrading MySQL,
you don't need to dump/restore your data, as you have to do with most
PostgreSQL upgrades.
Upgrading MySQL is painless. When you are upgrading MySQL, you don't need
to dump/restore your data, as you have to do with most PostgreSQL upgrades.
@end itemize
Drawbacks with MySQL compared to PostgreSQL:
@itemize @bullet
@item
The transaction support in MySQL is not yet as well tested as
PostgreSQL's system.
The transaction support in MySQL is not yet as well tested as PostgreSQL's
system.
@item
Because MySQL uses threads, which are still a moving target on
many OS, one must either use binaries from
@uref{http://www.mysql.com/downloads} or carefully follow our
instructions on
Because MySQL uses threads, which are not yet flawless on many OSes, one
must either use binaries from @uref{http://www.mysql.com/downloads}, or
carefully follow our instructions on
@uref{http://www.mysql.com/doc/I/n/Installing_source.html} to get an
optimal binary that works in all cases.
@item
Table locking, as used by the non-transactional @code{MyISAM} tables, is
in many cases faster than page locks, row locks or versioning. The
drawback however is that if one doesn't take into account how table
locks works, a single long-running query can block a table for updates
locks work, a single long-running query can block a table for updates
for a long time. This can usable be avoided when designing the
application. If not, one can always switch the trouble table to use one
of the transactional table types. @xref{Table locking}.
@item
With UDF (user defined functions) one can extend MySQL with
both normal SQL functions and aggregates, but this is not as easy or as
flexible as in PostgreSQL. @xref{Adding functions}.
With UDF (user defined functions) one can extend MySQL with both normal
SQL functions and aggregates, but this is not yet as easy or as flexible
as in PostgreSQL. @xref{Adding functions}.
@item
Updates and deletes that goes over multiple tables is harder to do in
MySQL. (Will be fixed in MySQL 4.0 with multi-table
Updates and deletes that run over multiple tables is harder to do in
MySQL. This will, hoever, be fixed in MySQL 4.0 with multi-table
@code{DELETE} and multi-table @code{UPDATE} and in MySQL 4.1
with @code{SUB-SELECT})
with subselects.
@end itemize
PostgreSQL offers currently the following advantages over MySQL:
PostgreSQL currently offers the following advantages over MySQL:
Note that because we know the MySQL road map, we have included
in the following table the version when MySQL should support
this feature. Unfortunately we couldn't do this for previous comparison,
because we don't know the PostgreSQL roadmap.
Note that because we know the MySQL road map, we have included in the
following table the version when MySQL should support this feature.
Unfortunately we couldn't do this for previous comparison, because we
don't know the PostgreSQL roadmap.
@multitable @columnfractions .70 .30
@item @strong{Feature} @tab @strong{MySQL version}
@item Subselects @tab 4.1
@item Foreign keys @tab 4.0 and 4.1
@item Views. @tab 4.2
@item Stored procedures in multiple languages @tab 4.1
@item Extensible type system. @tab Not planed
@item Unions @tab 4.0.
@item Full join. @tab 4.0 or 4.1.
@item Triggers. @tab 4.1
@item Constrainst @tab 4.1
@item Cursors @tab 4.1 or 4.2
@item Extensible index types like R-trees @tab R-trees are planned to 4.2
@item Inherited tables @tab Not planned
@item @strong{Feature} @tab @strong{MySQL version}
@item Subselects @tab 4.1
@item Foreign keys @tab 4.0 and 4.1
@item Views @tab 4.2
@item Stored procedures @tab 4.1
@item Extensible type system @tab Not planned
@item Unions @tab 4.0
@item Full join @tab 4.0 or 4.1
@item Triggers @tab 4.1
@item Constrainst @tab 4.1
@item Cursors @tab 4.1 or 4.2
@item Extensible index types like R-trees @tab R-trees are planned for 4.2
@item Inherited tables @tab Not planned
@end multitable
Other reasons to use PostgreSQL:
@itemize @bullet
@item
Standard usage is in PostgreSQL closer to ANSI SQL in some cases.
Standard usage in PostgreSQL is closer to ANSI SQL in some cases.
@item
One can get speed up PostgreSQL by coding things as stored procedures.
One can speed up PostgreSQL by coding things as stored procedures.
@item
Bigger team of developers that contributes to the server.
PostgreSQL has a bigger team of developers that contribute to the server.
@end itemize
Drawbacks with PostgreSQL compared to MySQL:
@itemize @bullet
@item
@code{Vaccum()} makes PostgreSQL hard to use in a 24/7 environment.
@code{VACUUM()} makes PostgreSQL hard to use in a 24/7 environment.
@item
Only transactional tables.
@item
Much slower insert/delete/update.
Much slower @code{INSERT}, @code{DELETE}, and @code{UPDATE}.
@end itemize
For a complete list of drawbacks, you should also examine the first table
@ -5304,88 +5329,83 @@ in this section.
@cindex PostgreSQL vs. MySQL, benchmarks
The only open source benchmark, that we know of, that can be used to
benchmark MySQL and PostgreSQL (and other databases) is our
own. It can be found at:
@uref{http://www.mysql.com/information/benchmarks.html}.
The only open source benchmark that we know of that can be used to
benchmark MySQL and PostgreSQL (and other databases) is our own. It can
be found at @uref{http://www.mysql.com/information/benchmarks.html}.
We have many times asked the PostgreSQL developers and some PostgreSQL
users to help us extend this benchmark to make the definitive benchmark
for databases, but unfortunately we haven't got any feedback for this.
users to help us extend this benchmark to make it the definitive benchmark
for databases, but unfortunately we haven't gotten any feedback for this.
We, the MySQL developers, have because of this spent a lot of
hours to get maximum performance from PostgreSQL for the benchmarks, but
because we don't know PostgreSQL intimately we are sure that there are
things that we have missed. We have on the benchmark page documented
exactly how we did run the benchmark so that it should be easy for
anyone to repeat and verify our results.
We the MySQL developers have, because of this, spent a lot of hours to get
maximum performance from PostgreSQL for the benchmarks, but because we
don't know PostgreSQL intimately, we are sure that there are things that
we have missed. We have on the benchmark page documented exactly how we
did run the benchmark so that it should be easy for anyone to repeat and
verify our results.
The benchmarks are usually run with and without the @code{--fast}
option. When run with @code{--fast} we are trying to use every trick
the server can do to get the code to execute as fast as possible.
The idea is that the normal run should show how the server would work in
a default setup and the @code{--fast} run shows how the server would do
if the application developer would use extensions in the server to make
his application run faster.
The benchmarks are usually run with and without the @code{--fast} option.
When run with @code{--fast} we are trying to use every trick the server can
do to get the code to execute as fast as possible. The idea is that the
normal run should show how the server would work in a default setup and
the @code{--fast} run shows how the server would do if the application
developer would use extensions in the server to make his application run
faster.
When running with PostgreSQL and @code{--fast} we do a @code{vacuum()}
between after every major table update and drop table to make the database
in perfect shape for the following selects. The time for vacuum() is
measured separately.
When running with PostgreSQL and @code{--fast} we do a @code{VACUUM()}
after every major table @code{UPDATE} and @code{DROP TABLE} to make the
database in perfect shape for the following @code{SELECT}s. The time for
@code{VACUUM()} is measured separately.
When running with PostgreSQL 7.1.1 we could however not run with
@code{--fast} because during the insert test, the postmaster (the
When running with PostgreSQL 7.1.1 we could, however, not run with
@code{--fast} because during the @code{INSERT} test, the postmaster (the
PostgreSQL deamon) died and the database was so corrupted that it was
impossible to restart postmaster. (The details about the machine we run
the benchmark can be found on the benchmark page). After this happened
twice, we decided to postpone the @code{--fast} test until next
PostgreSQL release.
impossible to restart postmaster. After this happened twice, we decided
to postpone the @code{--fast} test until next PostgreSQL release. The
details about the machine we run the benchmark can be found on the
benchmark page.
Before going to the other benchmarks we know of, We would like to give
some background to benchmarks:
Before going to the other benchmarks we know of, we would like to give
some background on benchmarks:
It's very easy to write a test that shows ANY database to be best
database in the world, by just restricting the test to something the
database is very good at and not test anything that the database is not
good at; If one after this publish the result with a single figure
things is even easier.
good at. If one after this publishes the result with a single figure,
things are even easier.
This would be like we would measure the speed of MySQL compared
to PostgreSQL by looking at the summary time of the MySQL benchmarks on
our web page. Based on this MySQL would be more than 40 times
faster than PostgreSQL, something that is of course not true. We could
make things even worse by just taking the test where PostgreSQL performs
worst and claim that MySQL is more than 2000 times faster than
PostgreSQL.
This would be like us measuring the speed of MySQL compared to PostgreSQL
by looking at the summary time of the MySQL benchmarks on our web page.
Based on this MySQL would be more than 40 times faster than PostgreSQL,
something that is of course not true. We could make things even worse
by just taking the test where PostgreSQL performs worst and claim that
MySQL is more than 2000 times faster than PostgreSQL.
The case is that MySQL does a lot of optimizations that
PostgreSQL doesn't do and the other way around. An SQL optimizer is a
very complex thing and a company could spend years on just making the
optimizer faster and faster.
The case is that MySQL does a lot of optimizations that PostgreSQL doesn't
do and the other way around. An SQL optimizer is a very complex thing, and
a company could spend years on just making the optimizer faster and faster.
When looking at the benchmark results you should look for things that
you do in your application and just use these results to decide which
database would be best suited for your application. The benchmark
database would be best suited for your application. The benchmark
results also shows things a particular database is not good at and should
give you a notion about things to avoid and what you may have to do in
other ways.
We know of two benchmark tests that claims that PostgreSQL performers
better than MySQL. These both where multi-user tests, a test
that we here at MySQL AB haven't had time to write and include in
the benchmark suite, mainly because it's a big task to do this in a
manner that is fair against all databases.
We know of two benchmark tests that claims that PostgreSQL performs better
than MySQL. These both where multi-user tests, a test that we here at
MySQL AB haven't had time to write and include in the benchmark suite,
mainly because it's a big task to do this in a manner that is fair against
all databases.
One is the benchmark paid for by
@uref{http://www.greatbridge.com/about/press.php?content_id=4,Great
Bridge}.
One is the benchmark paid for by Great Bridge, which you can read about at:
@uref{http://www.greatbridge.com/about/press.php?content_id=4}.
This is the worst benchmark we have ever seen anyone ever conduct. This
was not only tuned to only test what PostgreSQL is absolutely best at,
it was also totally unfair against every other database involved in the
test.
This is the probably worst benchmark we have ever seen anyone conduct. This
was not only tuned to only test what PostgreSQL is absolutely best at, it
was also totally unfair against every other database involved in the test.
@strong{NOTE}: We know that not even some of the main PostgreSQL
@strong{NOTE}: We know that not even some of the main PostgreSQL
developers did like the way Great Bridge conducted the benchmark, so we
don't blame them for the way the benchmark was made.
@ -5394,98 +5414,115 @@ we will here just shortly repeat some things that where wrong with it.
@itemize @bullet
@item
The tests where run with an expensive commercial tool, that makes it
The tests were run with an expensive commercial tool, that makes it
impossible for an open source company like us to verify the benchmarks,
or even check how the benchmark where really done. The tool is not even
a true benchmark tool, but a application/setup testing tool. To refer
this as STANDARD benchmark tool is to stretch the truth a long way.
or even check how the benchmarks were really done. The tool is not even
a true benchmark tool, but an application/setup testing tool. To refer
this as a ``standard'' benchmark tool is to stretch the truth a long way.
@item
Great Bridge admitted that they had optimized the PostgreSQL database
(with vacuum() before the test) and tuned the startup for the tests,
(with @code{VACUUM()} before the test) and tuned the startup for the tests,
something they hadn't done for any of the other databases involved. To
say "This process optimizes indexes and frees up disk space a bit. The
optimized indexes boost performance by some margin". Our benchmarks
clearly indicates that the difference in running a lot of selects on a
database with and without vacuum() can easily differ by a factor of 10.
@item
The test results where also strange; The AS3AP test documentation
mentions that the test does:
say ``This process optimizes indexes and frees up disk space a bit. The
optimized indexes boost performance by some margin.'' Our benchmarks
clearly indicate that the difference in running a lot of selects on a
database with and without @code{VACUUM()} can easily differ by a factor
of ten.
"selections, simple joins, projections, aggregates, one-tuple updates,
and bulk updates"
PostgreSQL is good at doing selects and joins (especially after a
vacuum()), but doesn't perform as well on inserts/updates; The
benchmarks seem to indicate that only SELECTs where done (or very few
updates) . This could easily explain they good results for PostgreSQL in
this test. The bad results for MySQL will be obvious a bit down in this
document.
@item
They did run the so called benchmark from a Windows machine against a
The test results were also strange. The AS3AP test documentation
mentions that the test does ``selections, simple joins, projections,
aggregates, one-tuple updates, and bulk updates''.
PostgreSQL is good at doing @code{SELECT}s and @code{JOIN}s (especially
after a @code{VACUUM()}), but doesn't perform as well on @code{INSERT}s or
@code{UPDATE}s. The benchmarks seem to indicate that only @code{SELECT}s
were done (or very few updates). This could easily explain they good results
for PostgreSQL in this test. The bad results for MySQL will be obvious a
bit down in this document.
@item
They did run the so-called benchmark from a Windows machine against a
Linux machine over ODBC, a setup that no normal database user would ever
do when running a heavy multi-user application. This tested more the
do when running a heavy multi-user application. This tested more the
ODBC driver and the Windows protocol used between the clients than the
database itself.
@item
When running the database against Oracle and MS-SQL (Great Bridge has
indirectly indicated that the databases they used in the test),
they didn't use the native protocol but instead ODBC. Anyone that has
ever used Oracle, knows that all real application uses the native
interface instead of ODBC. Doing a test through ODBC and claiming that
the results had anything to do with using the database for real can't
be regarded as fair play. They should have done two tests with and
without ODBC to provide the right facts (after having got experts to tune
all involved databases of course).
indirectly indicated that the databases they used in the test), they
didn't use the native protocol but instead ODBC. Anyone that has ever
used Oracle knows that all real application uses the native interface
instead of ODBC. Doing a test through ODBC and claiming that the results
had anything to do with using the database in a real-world situation can't
be regarded as fair. They should have done two tests with and without ODBC
to provide the right facts (after having got experts to tune all involved
databases of course).
@item
They refer to the TPC-C tests, but doesn't anywhere mention that the
tests they did where not a true TPC-C test and they where not even
allowed to call it a TPC-C test. A TPC-C test can only be conducted by
the rules approved by the @uref{http://www.tpc.org,TPC-council}. Great
Bridge didn't do that. By doing this they have both violated the TPC
trademark and miscredited their own benchmarks. The rules set by the
TPC-council are very strict to ensure that no one can produce false
results or make unprovable statements. Apparently Great Bridge wasn't
interested in doing this.
They refer to the TPC-C tests, but they don't mention anywhere that the
test they did was not a true TPC-C test and they were not even allowed to
call it a TPC-C test. A TPC-C test can only be conducted by the rules
approved by the TPC Council (@uref{http://www.tpc.org}). Great Bridge
didn't do that. By doing this they have both violated the TPC trademark
and miscredited their own benchmarks. The rules set by the TPC Council
are very strict to ensure that no one can produce false results or make
unprovable statements. Apparently Great Bridge wasn't interested in
doing this.
@item
After the first test, we contacted Great Bridge and mentioned to them
some of the obvious mistakes they had done with MySQL; Running
with a debug version of our ODBC driver, running on a Linux system that
wasn't optimized for threads, using an old MySQL version when there was
a recommended newer one available, not starting MySQL with the
right options for heavy multi-user use (the default installation of
MySQL is tuned for minimal resource use). Great Bridge did run a new
test, with our optimized ODBC driver and with better startup options for
MySQL, but refused to either use our updated glibc library or our
standard binary (used by 80% of our users), which was statically linked
with a fixed glibc library.
some of the obvious mistakes they had done with MySQL:
@itemize @minus
@item
Running with a debug version of our ODBC driver
@item
Running on a Linux system that wasn't optimized for threads
@item
Using an old MySQL version when there was a recommended newer one available
@item
Not starting MySQL with the right options for heavy multi-user use (the
default installation of MySQL is tuned for minimal resource use).
@end itemize
Great Bridge did run a new test, with our optimized ODBC driver and with
better startup options for MySQL, but refused to either use our updated
glibc library or our standard binary (used by 80% of our users), which was
statically linked with a fixed glibc library.
According to what we know, Great Bridge did nothing to ensure that the
other databases where setup correctly to run good in their test
environment. We are sure however that they didn't contact Oracle or
other databases were set up correctly to run well in their test
environment. We are sure however that they didn't contact Oracle or
Microsoft to ask for their advice in this matter ;)
@item
The benchmark was paid for by Great Bridge, and they decided to publish
only partial chosen results (instead of publishing it all).
only partial, chosen results (instead of publishing it all).
@end itemize
Tim Perdue, a long time PostgreSQL fan and a reluctant MySQL user
published a comparison on
@uref{http://www.phpbuilder.com/columns/tim20001112.php3,phpbuider}.
When we got aware of the comparison, we phoned Tim Perdue about this
because there was a lot of strange things in his results. For example,
When we became aware of the comparison, we phoned Tim Perdue about this
because there were a lot of strange things in his results. For example,
he claimed that MySQL had a problem with five users in his tests, when we
know that there are users with similar machines as his that are using
MySQL with 2000 simultaneous connections doing 400 queries per second (In
this case the limit was the web bandwidth, not the database).
MySQL with 2000 simultaneous connections doing 400 queries per second.
(In this case the limit was the web bandwidth, not the database.)
It sounded like he was using a Linux kernel that either had some
problems with many threads (Linux kernels before 2.4 had a problem with
this but we have documented how to fix this and Tim should be aware of
this problem). The other possible problem could have been an old glibc
problems with many threads, such as kernels before 2.4, which had a problem
with this but we have documented how to fix this and Tim should be aware of
this problem. The other possible problem could have been an old glibc
library and that Tim didn't use a MySQL binary from our site, which is
linked with a corrected glibc library, but had compiled a version of his
own with. In any of the above cases, the symptom would have been exactly
own with. In any of the above cases, the symptom would have been exactly
what Tim had measured.
We asked Tim if we could get access to his data so that we could repeat
@ -5498,26 +5535,25 @@ Because of this we can't put any trust in this benchmark either :(
Conclusion:
The only benchmarks that exist today that anyone can download and run
against MySQLand PostgreSQL is the MySQL benchmarks. We here
at MySQL believe that open source databases should be tested
with open source tools! This is the only way to ensure that no one
does tests that nobody can reproduce and use this to claim that a
database is better than another. Without knowing all the facts it's
impossible to answer the claims of the tester.
against MySQL and PostgreSQL is the MySQL benchmarks. We here at MySQL
believe that open source databases should be tested with open source tools!
This is the only way to ensure that no one does tests that nobody can
reproduce and use this to claim that a database is better than another.
Without knowing all the facts it's impossible to answer the claims of the
tester.
The thing we find strange is that every test we have seen about
PostgreSQL, that is impossible to reproduce, claims that PostgreSQL is
better in most cases while our tests, which anyone can reproduce,
clearly shows otherwise. With this we don't want to say that PostgreSQL
isn't good at many things (It is!) We would just like to see a fair test
clearly shows otherwise. With this we don't want to say that PostgreSQL
isn't good at many things (it is!). We would just like to see a fair test
where they are very good so that we could get some friendly competition
going!
For more information about our benchmarks suite see @xref{MySQL
Benchmarks}.
For more information about our benchmarks suite @xref{MySQL Benchmarks}.
We are working on an even better benchmark suite, including much better
documentation of what the individual tests really do and how to add more
documentation of what the individual tests really do, and how to add more
tests to the suite.
@ -9617,7 +9653,7 @@ thread stacks to stay away from the global heap. With @code{mysqld}, you
should enforce this "gentleman" behavior by setting a reasonable value for
the @code{max_connections} variable.
If you build MySQL yourself and do not what to mess with patching
If you build MySQL yourself and do not want to mess with patching
LinuxThreads, you should set @code{max_connections} to a value no higher
than 500. It should be even less if you have a large key buffer, large
heap tables, or some other things that make @code{mysqld} allocate a lot
@ -25075,21 +25111,27 @@ when the first row in t2 is found.
@item
The table @code{B} is set to be dependent on table @code{A} and all tables
that @code{A} is dependent on.
@item
The table @code{A} is set to be dependent on all tables (except @code{B})
that are used in the @code{LEFT JOIN} condition.
@item
All @code{LEFT JOIN} conditions are moved to the @code{WHERE} clause.
@item
All standard join optimizations are done, with the exception that a table is
always read after all tables it is dependent on. If there is a circular
dependence then MySQL will issue an error.
@item
All standard @code{WHERE} optimizations are done.
@item
If there is a row in @code{A} that matches the @code{WHERE} clause, but there
wasn't any row in @code{B} that matched the @code{LEFT JOIN} condition,
then an extra @code{B} row is generated with all columns set to @code{NULL}.
@item
If you use @code{LEFT JOIN} to find rows that don't exist in some
table and you have the following test: @code{column_name IS NULL} in the
@ -25112,8 +25154,8 @@ Note that the above means that if you do a query of type:
SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d (d.key=a.key) WHERE b.key=d.key
@end example
MySQL will do a full scan on @code{b} as the @code{LEFT
JOIN} will force it to be read before @code{d}.
MySQL will do a full scan on @code{b} as the @code{LEFT JOIN} will force
it to be read before @code{d}.
The fix in this case is to change the query to:
@ -32174,8 +32216,13 @@ ON conditional_expr |
USING (column_list)
@end example
Note that in versions before Version 3.23.16, the @code{INNER JOIN} didn't take
a join condition!
You should never have any conditions in the @code{ON} part that are used to
restrict which rows you have in the result set. If you want to restrict
which rows should be in the result, you have to do this in the @code{WHERE}
clause.
Note that in versions before Version 3.23.16, the @code{INNER JOIN} didn't
take a @code{join_condition}!
@cindex ODBC compatibility
@cindex compatibility, with ODBC
@ -32733,7 +32780,7 @@ files have become corrupted.
@example
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [(col_name,...)]
VALUES (expression,...)
VALUES (expression,...),(...),...
or REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [(col_name,...)]
SELECT ...