Independent Digital

Creating and managing dynamic websites using Php, MySql, HTML and Wordpress with a bit of PEAR on the side

Entries Comments



Email This Post    Print This Post   

Category: MySQL


Migrating to MySQL. Moving your database to MySQL.

24 May, 2007 (11:41) | MySQL | By: clive

Now you can switch to MySQL quickly and easily at no cost. The MySQL Migration Toolkit enables you to migrate from a number of proprietary databases to MySQL.

MySQL Migration Toolkit

The MySQL Migration Toolkit uses a proven migration methodology that enables you to quickly migrate your Oracle, Microsoft SQL Server, Microsoft Access or other database to MySQL. A Wizard-driven interface walks you through the steps.

MySQL Migration Methodology Reduces Risk

The MySQL Migration Toolkit implements a proven migration methodology that gives you complete control over the migration process.

Intuitive, Easy to Use Environment Improves Productivity

The MySQL Migration Toolkit provides a Wizard driven interface that automates manual migration tasks.

Remote Server to Server Migration

The MySQL Migration Toolkit supports Server to Server migration although you have the flexibility to run the MySQL Migration Toolkit on your desktop while migrating database applications between remote servers.

Migration of Database Objects including Schema, Stored Procedures, Triggers and Views

  • The MySQL Migration Toolkit will migrate or map objects from select proprietary database formats to MySQL
  • The toolkit enables you to use standard migration methods or you can easily create your own using the Java language
  • You can transfer data online in real-time or you can store a snapshot of the data and schedule a future transfer offline

    Extensible Modular Architecture

    The MySQL Migration Toolkit includes multiple migration modules for databases including Oracle, Microsoft SQL Server and Microsoft Access. Users can customize existing modules to migrate data from other proprietary database products

    Optimize the Migration Process

    The MySQL Migration Toolkit enables you to optimize the migration process by manually editing the script of generated objects.

    Standards-based Platform Independent Migration

    The MySQL Migration Toolkit uses a platform independent agent written in Java so you can migrate databases to MySQL on multiple platforms. You also have the flexibility to add native methods written in C.

    Comprehensive Summary Reports

    The MySQL Migration Toolkit generates comprehensive reports that provide detailed information about the number of objects (tables, views, etc) that have been migrated.

    Get the FREE MySQL Migration Toolkit

    MySql and the top reasons for using it

    22 May, 2007 (13:33) | MySQL | By: clive

    If you are considering which database to use then consider these ten reasons why you should choose MySql. The following is an extract form the MySql website and is quoted verbatim:

    Top Reasons to Use MySQL

    1. Scalability and Flexibility

    The MySQL database server provides the ultimate in scalability, sporting the capacity to handle deeply embedded applications with a footprint of only 1MB to running massive data warehouses holding terabytes of information. Platform flexibility is a stalwart feature of MySQL with all flavors of Linux, UNIX, and Windows being supported. And, of course, the open source nature of MySQL allows complete customization for those wanting to add unique requirements to the database server.

    2. High Performance

    A unique storage-engine architecture allows database professionals to configure the MySQL database server specifically for particular applications, with the end result being amazing performance results. Whether the intended application is a high-speed transactional processing system or a high-volume web site that services a billion queries a day, MySQL can meet the most demanding performance expectations of any system. With high-speed load utilities, distinctive memory caches, full text indexes, and other performance-enhancing mechanisms, MySQL offers all the right ammunition for today’s critical business systems.

    3. High Availability

    Rock-solid reliability and constant availability are hallmarks of MySQL, with customers relying on MySQL to guarantee around-the-clock uptime. MySQL offers a variety of high-availability options from high-speed master/slave replication configurations, to specialized Cluster servers offering instant failover, to third party vendors offering unique high-availability solutions for the MySQL database server.

    4. Robust Transactional Support

    MySQL offers one of the most powerful transactional database engines on the market. Features include complete ACID (atomic, consistent, isolated, durable) transaction support, unlimited row-level locking, distributed transaction capability, and multi-version transaction support where readers never block writers and vice-versa. Full data integrity is also assured through server-enforced referential integrity, specialized transaction isolation levels, and instant deadlock detection.

    5. Web and Data Warehouse Strengths

    MySQL is the de-facto standard for high-traffic web sites because of its high-performance query engine, tremendously fast data insert capability, and strong support for specialized web functions like fast full text searches. These same strengths also apply to data warehousing environments where MySQL scales up into the terabyte range for either single servers or scale-out architectures. Other features like main memory tables, B-tree and hash indexes, and compressed archive tables that reduce storage requirements by up to eighty-percent make MySQL a strong standout for both web and business intelligence applications.

    6. Strong Data Protection

    Because guarding the data assets of corporations is the number one job of database professionals, MySQL offers exceptional security features that ensure absolute data protection. In terms of database authentication, MySQL provides powerful mechanisms for ensuring only authorized users have entry to the database server, with the ability to block users down to the client machine level being possible. SSH and SSL support are also provided to ensure safe and secure connections. A granular object privilege framework is present so that users only see the data they should, and powerful data encryption and decryption functions ensure that sensitive data is protected from unauthorized viewing. Finally, backup and recovery utilities provided through MySQL and third party software vendors allow for complete logical and physical backup as well as full and point-in-time recovery.

    7. Comprehensive Application Development

    One of the reasons MySQL is the world’s most popular open source database is that it provides comprehensive support for every application development need. Within the database, support can be found for stored procedures, triggers, functions, views, cursors, ANSI-standard SQL, and more. For embedded applications, plug-in libraries are available to embed MySQL database support into nearly any application. MySQL also provides connectors and drivers (ODBC, JDBC, etc.) that allow all forms of applications to make use of MySQL as a preferred data management server. It doesn’t matter if it’s PHP, Perl, Java, Visual Basic, or .NET, MySQL offers application developers everything they need to be successful in building database-driven information systems.

    8. Management Ease

    MySQL offers exceptional quick-start capability with the average time from software download to installation completion being less than fifteen minutes. This rule holds true whether the platform is Microsoft Windows, Linux, Macintosh, or UNIX. Once installed, self-management features like automatic space expansion, auto-restart, and dynamic configuration changes take much of the burden off already overworked database administrators. MySQL also provides a complete suite of graphical management and migration tools that allow a DBA to manage, troubleshoot, and control the operation of many MySQL servers from a single workstation. Many third party software vendor tools are also available for MySQL that handle tasks ranging from data design and ETL, to complete database administration, job management, and performance monitoring.

    9. Open Source Freedom and 24 x 7 Support

    Many corporations are hesitant to fully commit to open source software because they believe they can’t get the type of support or professional service safety nets they currently rely on with proprietary software to ensure the overall success of their key applications. The questions of indemnification come up often as well. These worries can be put to rest with MySQL as complete around-the-clock support as well as indemnification is available through MySQL Network. MySQL is not a typical open source project as all the software is owned and supported by MySQL AB, and because of this, a unique cost and support model are available that provides a unique combination of open source freedom and trusted software with support.

    10. Lowest Total Cost of Ownership

    By migrating current database-drive applications to MySQL, or using MySQL for new development projects, corporations are realizing cost savings that many times stretch into seven figures. Accomplished through the use of the MySQL database server and scale-out architectures that utilize low-cost commodity hardware, corporations are finding that they can achieve amazing levels of scalability and performance, all at a cost that is far less than those offered by proprietary and scale-up software vendors. In addition, the reliability and easy maintainability of MySQL means that database administrators don’t waste time troubleshooting performance or downtime issues, but instead can concentrate on making a positive impact on higher level tasks that involve the business side of data.

    Have a look at the MySql website for more information. There are also a few useful articles on the Independent Digital website and elsewhere on the web. With a user-base of more than 11 million, you can’t go wrong if you choose MySql. Best of all, it’s FREE!

    MDB2 Users tutorial - using results

    24 April, 2007 (10:36) | MySQL, PEAR | By: clive

    If you’re new to MDB2 or maybe even if you’re a bit rusty as to which method to use to get the results that you want out of the database, then this tutorial is for you. It’s also a reference tool so you’ll end up using it whenever you work with MDB2.

    You run the index file and a drop-down list of all the methods you can use to get information out of your query result appears. Select one and a brief explanation of that method pops up together with an example of how it’s used. The using_results.php file contains all the code with comments for you to browse through. All in all a very useful tutorial and reference tool. I will be adding to this tutorial regularly until eventually it will contain all the methods applicable to the mdb2 package (configuration, connecting, querying, using results, etc). So check back here regularly to keep abreast of developments.

    Download the tutorial here

    PEAR:MDB2 tutorial for getting started and using it

    18 April, 2007 (13:05) | MySQL, PEAR | By: clive

    I thought I’d put together this tutorial for those of you, like me, would like to use something to make life easier when using databases but are somewhat overwhelmed when first dabbling with database abstraction layers such as PEAR:mdb2. Fear not, mdb2 is pretty easy to use once you know how. And that is the point of this tutorial, to show you how to use mdb2 without too much hassle.

    This is not a tutorial in the sense that I ramble on and give snippets of code which eventually leaves you totally confused. Rather I have created a package of files covering the main aspects of mdb2. Each file has all the code commented to make understanding the code very easy. So for example the create_db.php file has all the code necessay to create a database. You actually run the code and enter a database name and the database is created. Here’s a sample from the code:

    //create the data source name
    $dsn = ‘mysql://root:sasha@localhost’;//database name is omitted as it will be added from your input
    //do the connection thing
    $mdb2 =& MDB2::connect($dsn);
    //if there is an error
    if (PEAR::isError($mdb2))
    {
    die($mdb2->getMessage());
    }
    // loading the Manager module
    $mdb2->loadModule(’Manager’);

    By working through the code and acompanying comments, you will quickly see how easy it is to use mdb2. Naturally you can tweek the code and use it as you please.

    The files in the package cover the following:

  • Creating a database
  • Deleting a database
  • Creating a table
  • Deleting a table
  • Creating an index
  • Deleting an index
  • Creating sequences
  • Deleting sequences
  • Creating constraints
  • Deleting constraints
  • Inserting records
  • Searching records
  • You can get the file package here

    Using PEAR:AUTH and PEAR:MDB2 to secure your website

    23 March, 2007 (15:07) | MySQL, PEAR | By: clive

    It’s easy and secure.

    Requirements


    It is assumed that you have a working knowledge of HTML and php and have the necessary PEAR packages installed. You will also need a database (I have used MySql here but you can use others, just make sure that you then install the necessary MDB2 driver for that database). The PEAR packages should include:

  • Auth - the authentication package
  • MDB2 - the database abstraction layer
  • MDB2_Driver_mysql - the MySql MDB2 driver
  • HTML_Quickform - which creates and processes HTML forms
  • HTML_Common2 - the abstract base class for HTML classes

  • The basic structure of the website


    I used the Andreas02 template created by Andreas Viklund for this tutorial. He has some greate templates, some of which are open source.

    This imaginary website consists of 4 pages, the main home or index page, a login page, a register page and a members page. I used the same HTML template (andreas02) for all the pages.

    Navigational structure


    The index page has a sidebar containing the login and register links. The user must register before he can log in. The login link takes the user to the login page where, on successfully logging in, the user is taken to the members page. The members page has a logout button which logs the user out and returns him to the index page. The register link takes the user to the register page. Successful registration takes the user to the login page. A failed registration takes the user back to the registration page and displays an error message. A failed login returns the user to the index page and displays an error message.

    The Index page


    This is basically the home page of the website and consists almost entirely of HTML. It is the standard Andreas02 template with a few php lines and 2 added links. The file is then saved as a php file.

    The Login page


    The login page uses the same Andreas02 template and includes some php code. This code displays a Quickform login form consisting of the username and password text boxes and a login button. On completing the form, the form is processed and the username and password are compared to those stored in the database. If a match is made, the user is redirected to the members page. If no match is made, the user is returned to the login page with an error message included in the URL. The login page uses Quickform to create and process the login form and MDB2 to connect to and query the database. It also uses Auth to authenticate the user.

    The Register page


    The register page uses the Andreas02 template and includes some php code. The code displays a Quickform consisting of the username and password text boxes and the register button. On completing the form, the form is processed and the database is querried to see whether the username exists. If it does then the user is returned to the register page with an error message stating that the user exists. If the user does not exist then the username and password are added to the database and the user is redirected to the login page. If the data is not added successfully to the database, then the user is redirected to the register page.

    The Members page


    The members page uses the Andreas02 template and includes some php code. The code displays a Quickform logout button. When clicked, it destroys cookies, sessions, flushes the output buffer, logs the user out and directs him to the index page.

    Download the file containing all the files for the imaginary website here

      Secure_site.zip (34.1 KiB, 89 hits)

    . Useful to analyse the use of the php code (commented) which shows how to use Auth, MDB2 and Quickform within a website.

    Hope you find this short tutorial useful. Let me know if you need any help using Auth, Quickform, MDB2 or Flexy.

    MySQL: query optimization

    1 December, 2006 (10:43) | MySQL | By: clive

    When you query any database, you expect the result to be displayed immediately. Any delay is frustrating and in some cases can cause you to toss your laptop out of the window. You?d probably follow, either voluntarily or assisted, if the information delivered by the query was erroneous. It is therefore safe to say that accuracy and speed are high priority items when it comes to using databases.

    The following serves to briefly highlight some of the options available when trying to squeeze that extra bit of speed out of your queries. It is not exhaustive and I recommend that you consult the manual for more detailed usage of the options mentioned here.

    MySQL achieves speed and accuracy firstly, and most importantly, through indexing. (Have a look at the article MySQL: Indexes). Further nanoseconds can be sliced off through correct formulation of the query. Somewhere deep inside the MySQL black box lurks the query optimizer, which tirelessly analyses each and every query to determine the fastest route to the result that you are seeking. The optimizer examines the query then looks for an index. There are a number of things that you can do to speed things up:

    ? Try to compare columns of the same data type. For example, CHAR (10) is the same as CHAR (10) but not the same as CHAR (32).
    ? Compare columns that don?t have any calculations that first have to be resolved before the comparison can be made. For example:
    ?WHERE column_name < 12 as opposed to
    ?WHERE column_name * 22 < 12.

    The optimizer can make use of the index in the first option but not in the second. Here each row of the column has to first be multiplied by 22 before the comparison can be made.

    ? Don?t use wildcards at the beginning of a string comparison. The optimizer uses the beginning of the string to search the index.
    ? Help the optimizer choose the order in which it scans the tables. (It decides on the order but like any machine, it can be wrong). You can also help it choose a particular index (see below).
    ? Test alternative forms of the query. Try running these a few times and see which ones are faster. (Remember that data is cached so the second time a query is run, it may read directly from the cache. To remedy this, run the alternatives a few times).
    ? Do not abuse the automatic type conversion ability of MySQL. If possible, compare numbers to numbers and strings to strings.
    ? Use EXPLAIN to see the effects of writing the query differently.

    Overriding optimization

    You can override the optimizer?s table-join order and force it to use a particular order by using STRAIGHT_JOIN. Similarly you can force the optimizer to use a particular index by using FORCE INDEX and USE INDEX.

    When choosing data types for your columns:

    ? Try and use the shortest possible data types (have a look at the articles: MySQL: choosing data types and MySQL data types: numeric) for the columns. This will result in smaller tables and shorter indexes that will process faster. Also try and choose an optimal row storage format for your storage engine. The fixed-length format is preferred for the MyISAM (have a look at this article MySQL storage engines: MyISAM) engine as it is faster. The downfall is that fixed-length values take up more storage space. Generally speaking, the variable formats such as VARCHAR, require periodic de-fragmentation of the files because of the manner in which the data blocks are stored (non-contiguously), especially the more updates and deletes are run.
    ? Sometimes you may not have a choice and will be forced to use variable length data types. For example, any strings larger than 255 characters will have to be stored as a variable length data type. The storage engine also determines which data type to use. The MEMORY storage engine only stores tables as fixed data types.

    Try testing the performance of queries using either of the formats to see which is faster.

    ? Define columns to be NOT NULL as this uses less space and is faster.
    ? Consider converting string columns with low cardinality to ENUM columns, which are processed faster.
    ? Use PROCEDURE ANALYSE ( ) to display the characteristics of the data in the selection clause. It may reveal a more efficient data type for this column.
    ? Use OPTIMIZE TABLE to improve the performance of tables subject to fragmentation. Fragmentation slows down performance and is particularly common in columns containing variable length data types such as BLOB and TEXT. OPTIMIZE TABLE only works for MyISAM tables. With other engines you can dump the table, drop it then recreate it using the dumped file.
    ? Use client applications to pack data values not easily represented in a standard table structure or data that change over time, into a BLOB or TEXT data type. The benefit of this is that most of the processing is handled by the client application and MySQL only has to deal with a single retrieval or insert.
    ? Use synthetic indexes for exact matches. You can create your own or use MD( ), SHA( ) or CRC32( ). This is particularly useful for BLOB and TEXT types where it is much faster to search the index of these data types than to search the columns themselves.
    ? Put large TEXT and BLOB columns into a separate table. This allows the other tables to contain fixed format columns, which reduces fragmentation and increases performance. You are then able to do SELECT * queries without having to search the large BLOB and TEXT tables. It also reduces network traffic (low data throughput).

    Loading data efficiently

    Try the following to speed up inserting and updating records:

    By using bulk or batch loading, you reduce the number of times the key cache is flushed (once per batch as opposed to once per row). The result is faster data loading.
    ? Use LOAD DATA rather than INSERT as it loads the data in bulk, resulting in the server only having to parse and interpret one statement. Also the key cache is only flushed once. Using LOAD DATA without LOCAL is also faster as the server reads the file directly from the server disk rather than with LOCAL where the client sends the file across the network which is a lot slower.
    ? When using INSERT, try and insert multiple rows at once by putting the values in parenthesis, for example:

    INSERT?.values (?..), (?..), ?..;

    This results in fewer statements and less key cache flushing. For transactional storage engines (InnoDB and BDB) use INSERT within a single transaction:

    START TRANSACTION;
    INSERT INTO table_name?..;
    INSERT INTO table_name?..;
    INSERT INTO table_name?..;
    COMMIT;

    For non-transactional engines (MyISAM and MEMORY) write lock the table, issue your INSERT statements then unlock the tables. Both these options only flush the key cache once.
    ? To reduce the number of key cache flushes when using MyISAM tables, you can use the DELAYED_KEY_WRITE option where data rows are immediately written to the file but the cache is only flushed occasionally.
    ? Not having an index also speeds things up as an index has to be modified after each record is loaded.
    ? In MyISAM tables you can create the tables without indexes, load the data then create the indexes. It is faster to create all the indexes at once. In the same vain, you can drop or deactivate existing indexes, load the data then reactivate or rebuild them.
    ? Shorter SQL statements improve performance as there is less parsing for the server to do and less data throughput across the network.
    ? You can also use compressed protocols to reduce data throughput but bear in mind than any compression protocol is processor hungry.

    Scheduling and locking issues

    You can control the way MySQL handles multiple queries. The default order of priority is:

    ? Writes have a higher priority than reads.
    ? Writes to tables occur one at a time in the order they arrive.
    ? Multiple simultaneous reads from a table are possible.
    The MyISAM and MEMORY engines use table locks to control multiple queries. This is done automatically by the server and can be overridden by issuing explicit LOCK TABLES and UNLOCK TABLES statements. The write lock allows exclusive access to one writer. No other reads or writes are permitted. The read lock is not exclusive as it allows other readers but not writers.

    The LOW_PRIORITY keyword allows you to change the scheduling for DELETE, INSERT, LOAD DATA, REPLACE and UPDATE statements in MyISAM and MEMORY tables. It lowers the priority of these statements. You can set the default at server start-up for all statements to be LOW_PRIORITY. This can be overridden for individual statements by using the INSERT HIGH_PRORITY option. HIGH_PRIORITY allows you to change the scheduling for SELECT and INSERT statements to allow them to take priority.

    The DELAYED keyword can be used with INSERT and REPLACE statements when working with MyISAM and MEMORY tables. It puts all INSERT and REPLACE statements in a queue, which is then executed once the server is free. The only problem is that the queue resides in memory and will be lost in the event of a crash.

    Concurrent inserts

    The MyISAM and MEMORY engines use table-level-locks, the BDB engine uses page-level-locks and InnoDB uses row-level-locks to manage concurrent inserts. Locking at the row level allows more concurrent users as long as they do not access the same row. MyIsam tables are very fast for retrieval but slow down considerably when there are mixed retrievals and inserts because of the table locking strategy. The whole table is locked for each insert or update effectively denying access to all other clients. The BDB and InnoDB engines allow greater concurrency and speed when handling these mixed statements because of their lower level locking strategy. Only the page or row is locked which still allows other clients access to the rest of the table. The downfall of these two locking strategies is that deadlock (one client has a lock that another needs before they can continue) can occur. If it does, the server has to abort one of the transactions to allow the other to continue. This problem does not happen with the table-lock strategy as it allows only one client at a time.

    Hopefully this article has touched on a few of the options available to tweak your MySQL queries so as to get that extra bit of speed out of the beast.

    MySQL: naming rules

    17 November, 2006 (14:00) | MySQL | By: clive

    SQL or Structured Query Language is the language used to communicate with the MySQL server. Like any language it has it’s rules governing how you use it.

    When refering to databases, tables, columns, indexes and aliases, you will have to obey the following:

    • Identifier characters - can be any alphanumeric character (as long as it’s in the default character set) plus ‘_’ and ‘$’. They can also start with any legal character (legal to be included in the identifier), even a digit. Identifiers can be quoted within ‘`’ (backtick or `), which allows the use of any character except the backtick .
    For database and table names (indentifiers), you can’t use the ‘.’ character, nor the pathname separator (’/', ‘\’). Your operating system may also have other constraints that you need to take note of.Column and table names (aliases) are pretty much unrestricted.
    • Identifier length - identifiers can be up to 64 characters long.
    • Identifier qualifiers - sometimes to avoid confusion and especially when the context does not indicate which database, table or column is referred to, it may be necessary to qualify which database, table or column a particular identifier refers to.
      • With databases - just refer to it’s name
      • With tables - either qualify both database and table (…FROM database_name.table_name) or the table_name by itself will use the default database. You can’t use the table_name unless a database has been selected.
      • With columns - either:
        • fully qualify the identifier: database_name.table_name.column_name
        • partially qualify the identifier: table_name.column_name (using the default database)
        • unqualified identifier: column_name which refers to the currently active table and database.

    The case sensitivity rules vary depending on the operating system and part of the SQL statement.

    Always use the same lettercase format in all your queries, regardless of the lettercase you have chosen.
    • Keyword and function names are not case sensitive
    • Database and table name case sensitivity depends on the way the host operating system treats filenames. It is therefore a good idea to stick to a particular lettercase (either upper or lower).
    • Column and index names are not case sensitive.
    • Alias names can be in any lettercase or mixed. Just always use the same format whenever you refer to them.

    MySQL: Choosing Data Types

    16 November, 2006 (13:37) | MySQL | By: clive


    You can store anything in string types. But be aware that numbers are stored more efficiently using numeric columns. Numbers and strings are also treated differently when querying which will affect the result. You need to make the correct choice before creating the tables but it is possible to easily make changes after the tables have been created and the data loaded.

    Use ALTER TABLE to change the type

    Use PROCEDURE_ANALYSE( ) to get information about the tables columns including a suggested optimal type for that column.

    • What kind of values will the column hold?

    Think of what the values will be used for. Put numbers in numeric columns, strings in string columns, dates and times in temporal columns. For numbers, decide whether they are floating-point or integer type. What is the degree of accuracy required and what is the unit of measure? Use integers for whole measures and floating-point for fractionals. Bear in mind that the stored value does not necessarily have to be the display value. There are a number of MySQL functions that can be used to convert numbers.

    When working with monetary numbers remember that float and double are subject to rounding off. For accuracy of storage and retrieval, use DECIMAL, which is represented as a string and is not rounded off. The problem is that any calculations done are done using floating-point operations where round off can occur. The answer is to represent all money as cents in an integer column.

    • Do the values lie within a specific range?

    The range of your values determines the types that you can use. Use the smallest type possible. This will reduce storage requirements and speed up processing. Use the biggest type if you don?t know the range. You can always use ALTER TABLE later to change the type. The range of values also determines the attributes that you can use with that data type. For example UNSIGNED for positive numbers. String values have a maximum length of 255 characters, which determines whether TINYTEXT, CHAR or a longer TEXT type should be used. Try using the SET or ENUM data types for strings of fixed values. They are more efficient and take up less storage space.

    • What are the performance and efficiency issues?

      Data types affect the query performance in a number of ways:

          • Numeric operations are usually faster than string operations. So if possible, when working with strings, consider using SET and ENUM types which are processed as numbers.
          • Smaller types are processed faster than larger types. They use less storage space and consequently reduce disk activity. For fixed-sized data types, use the smallest possible type.
          • Fixed and variable length data types affect performance differently and varies depending on the storage engine. Fixed is preferable for MyISAM tables. It doesn?t matter with MEMORY tables, as values are stored internally as fixed-length. Similarly with InnoDB and BDB tables. Do tests using both and see whether there are any benefits in using either.
          • NOT NULL columns are processed faster and use less storage space.
      • The affects of Data Type choices

        Consider the choice of a data type within the context of storage space, performance issues, allowable attributes and the required values. Your data type needs to slip comfortably within all these issues.

    MySQL:Indexes

    9 November, 2006 (12:19) | MySQL | By: clive

    Queries can take a while before the result is displayed. Indexing helps to speed up the query process although there are other ways of helping to improve the speed of querying the database, such as:

    • Care in writing the query so as to best use the indexes
    • Proper configuration of the server software so as to efficiently handle queries from multiple clients
    • Proper configuration of the servers hardware to improve performance

    You will go a long way to improve the query speed if you first consider indexing your tables before trying any other means of increasing speed. Bear in mind that indexes can also hinder the performance of your database.

    Advantages of using indexes

    Consider a table consisting of 3 columns:

    1. ID number
    2. First name
    3. Last name

    If we were to search for a particular person, we would have to scan the whole table until we find a row where all three columns match our required name. This would be slow and inefficient. Suppose we indexed the first name column, this column would contain all the first names (some could be duplicates), sorted alphabetically. These would link to the corresponding record in the table (ID number, first name & last name). Now if we were to search for a particular first name, the scan would immediately jump to the first occurrence of that name (in the index), display the record (the complete record in the table) linked to that occurrence then move on repeatedly, each time displaying the relevant record, until there is no longer a match in the index (remember that the index is sorted alphabetically). This process is much faster as there is no need to scan through the whole table. The scan immediately jumps to the first occurrence of the match and then leaves once it reaches the last occurrence (due to the sorted index).

    This index would only work if we were searching the first name column. What if we were searching the ID or last name columns? No problem, it is possible to have many indexes in one table so we could also have an ID index as well as a last name index. Consider the speed differences in searching a database containing a million records where the id numbers are stored randomly as opposed to searching the index of this database where the indexed ID values are stored in numerical order. The indexed value would be found almost instantaneously.

    Indexes are also valuable when queries involve table joins. Single table queries are faster as the maximum number of values to be scanned per column is equal to the number of rows in the table. In a 3 column table containing 10 records, this would be a maximum of 30 records. Compare this to a query involving 3 tables, each with 3 columns and each having 10 records. With no index, we would have to scan all possible combinations: 10 x 10 x 10 or 1000 possible combinations! Naturally the larger the tables and the more records they contain, the more possible combinations and the longer the query will take. An index drastically reduces the number of combinations and consequently the processing time.

    MySQL uses indexes as follows:

    • speeds up searches for rows matching the WHERE clause or matching rows in a join
    • queries of the index using MIN( ) and MAX( ) functions to find the largest and smallest values are solved without examining each row
    • uses index to sort or group quickly using ORDER BY and GROUP BY clauses
    • can read the queried value directly from the index without having to scan the columns

    Disadvantages of using indexes

    The advantages far outweigh the disadvantages but bear the following in mind when creating indexes:

    • Indexes speed up retrieval but slow down inserts, deletes and updates of values in indexed columns. These operations all involve writing, not only the column values but also in the index. Consequently the more indexes there are, the more changes need to be made and naturally more time is needed for these changes.
    • Indexes take up disk space. This may be a problem in certain of the MySQL storage engines where table size is an issue (MyISAM and BDB) while the InnoDB storage engine can be configured to use multiple files and so is not limited by the operating systems file-size limit.

    Choosing indexes

    • Index columns used for searching, sorting and grouping. Don’t index columns used for display. The best columns are those appearing in the WHERE, ORDER BY or GROUP BY clauses as well as columns named in join clauses. So in the following:

    SELECT column1 from table1 LEFT JOIN table2 ON table1.column2 = table2.column3 WHERE column4 = searchTerm;

    column2, column3 and column4 are candidates for indexing.

    • Column cardinality. This is the number of unique values contained in a column. Indexes work best when there is a high cardinality. Put another way, the more unique values there are (fewer duplicates) the better that column will be for indexing. Consider the ID number column of the previous example. Here there are no duplicates, only unique values. This column will be ideal for indexing. On the other end of the scale may be the first names column. Here there will probably be a number of duplicate names (fewer unique values) and a lower cardinality compared to the ID column.
    • Index short values. Use smaller data types as these improve indexing:
      • shorter values are easier to compare
      • results in smaller indexes
      • enables more keys to be stored in memory and therefore fewer disk reads

    For clustered indexes (data rows are stored with the primary keys) such as is used by the InnoDB and BDB storage engines, it is best to keep the primary key short. In other indexes (secondary indexes), the primary key is stored in each (secondary) index with the result that longer keys take up more space.

    • Index prefixes of strings. If possible indicate a prefix length when indexing strings. For example if a column has a data type of CHAR(200) and most of the unique values are within the first 10 characters, then index only these first 10 characters and not all 200. It will save space and improve performance. Data types such as CHAR, VARCHAR, BINARY, VARBINARY, BLOB and TEXT all lend themselves to indexing the prefixes.
    • Leftmost prefixes. A 3 column composite index is actually 3 indexes. The leftmost columns in this index are the leftmost prefix. The index can only be used if the leftmost prefix is involved. So in a 3-columned composite index, ID, first name and last name the leftmost prefix would be:
      • ID
      • ID, first_name
      • ID, first_name, last_name
      • you cannot use first_name or last_name as an index as they do not contain the leftmost prefix
    • Too many indexes. Remember that each additional index uses space and affects performance. Also, MySQL may not choose the correct index when there are too many.
    • Match index types to types of comparisons performed.
      • Hash indexes are good for exact-match comparisons (= or <=>) and poor for range values (BETWEEN 100 AND 150). Hash index is the default for MEMORY tables.
      • B-tree indexes are good for exact or range-based comparisons and LIKE pattern matches where the pattern begins with a literal string and not a wildcard.
    • The slow query log identifies badly performing queries. The slow query log can identify queries that may benefit from indexing. View the log either in text form or by using mysqldumpslow. Duplicate queries here will indicate a need for a query rewrite. The only problem with the slow query log is that a busy server causes more queries to end up here than a not so busy server.

    Remember that indexing is just one way of speeding up the query process. Others involve the choice of storage engine, data types, etc.

    See also the article on configuring the MyISAM storage engine as well as the article on the MyISAM storage engine.

    MySQL: the basics

    8 November, 2006 (11:48) | MySQL | By: clive

    Structure of a database

    MySQL is a relational database management system or RDBMS for short.

    • R for relational - data from one table is joined to data from another table to provide a unique piece of information. The data from different tables have a relationship, something in common.
    • DB for database - this is the storage house for the data.
      • a piece of information is stored in a column
      • a record is a complete piece of information stored across a number of columns, each containing a piece of the record
      • a number of columns make up a table
      • each row in that table is a record
      • a number of tables make up database

    A database, undergraduates, contains 2 tables, personal_details and courses. Personal_details has 3 columns, student_number, first_name and last_name. Courses has 2 columns, english and maths. One record is stored in the database. In tabular form it would look like this:

    undergraduates

    personal_details

    courses

    student_number

    first_name

    last_name

    english

    maths

    1002345

    Jack

    Sprat

    75

    10

    • MS for management system. This is the software that allows you to insert, edit and delete the records.

    The query language

    SQL or structured query language is the language used to communicate with MySQL. It is a standard database language used by most databases.

    MySQL architecture

    MySQL is in essence two programs operating on 2 levels:

    • mysqld, which is the MySQL server, runs on the machine where the database is stored. It access the database in response to requests received from the client
    • the client program interfaces with the client and the server, issuing requests and receiving results

    There are many client programs, each with a specific purpose. mysql Is the most common. mysqldump A backup program and mysqladmin are both administrative clients. MySQL also has a client-programming library allowing for 3rd party programs in a number of languages, including C, Perl and PHP.

    The benefits of a client/server architecture

    • The server controls access and in so doing determines the order in which requests are performed
    • It allows access via a network and has a security system to protect the database

    Of note is the fact that MySQL refers to the entire database management system while mysql refers to a particular client program.