Thursday, May 15, 2008

Learning

Wow, what a blast these past three days have been!  We had a consultant in from MySQL, and I learned hoards of new tricks, tips, and techniques.  Here are a few:
  • You can store images on the filesystem based on their PK, to save having a field with the path.  For example, image.id = 15579, image.path = images/1/5/5/7/9.png (or whatever level of separation you find appropriate).
  • EXT3--though it can be tuned--is not an ideal filesystem for databases.  XFS is much better.  (He didn't give specific reasons, and I don't particularly care to look into it.)
  • QCache_hits should be much higher than QCache_inserts.  (show with "show global status like 'qc%'")  If it's not, you need to hard-code which queries use the cache and which don't.  Global status can also tell you about the number of locks that are occurring, which you may also wish to tweak.
  • query_cache_type values are: 0 = off, 1 = on, 2 = on demand
  • QCache hits are never logged.  That would take more time.  So if you want to really see how often queries are running, you have to turn of caches.
  • 400M is a reasonable limit for query cache.
  • Stop apache from forking: this is what brings so many sites down.  Instead, figure out how many concurrent processes your box can really handle, and then set the number of connections to this limit. If this is exceeded, the OS will queue additional requests, and keep the system from dying.  100 is a reasonable place to start.
  • MyISAM tables are fine if you're read-only (or nearly so): they will be faster.  But InnoDB is much, much faster, since they store the table completely in memory.  Of course, this means you have to have enough memory to fit the damned thing.  ; )  ...And, of course, there are caveats about very large fields (like blobs, texts, and varchars with large average size).  Shrinking your data as much as possible is important with large datasets... even one unused byte becomes a Meg with a million rows!
  • "show innodb status\g" is a tool for checking InnoDB performance.
  • strace shows all the system calls used for a command (linux).  Very cool!  Not sure how I hadn't seen this one before...
  • Sharding is what massive scaling is all about... but that's a topic you'll have to look up on the web, since there's so much to it.  ; )

No comments: