Printable
Version
 

FileMaker index type & size test across 100k records (& empty fields)

Every measurement of bytes was done after I performed "File Maintenance",
"Compact File" & "Optimize File" in FileMaker Pro 9 Advanced.

2 pronged experiment

    I appreciate this (perhaps) borders on lunacy, but I am contemplating building a pretty large system and wish to start from a solid base, thus having all facts and figures b4 me is helpful. Do certain ways of doing things have unacceptably high space costs or zero costs? I have found the results helpful, particularly in relation to the use of empty and generally redundant fields (allow me to explain below), and since I recorded the results I thought I may as well make them available.

  • to determine if key ID methodology significantly impacts index & file size

      I created 100k records for the test purposes in a single table with just one indexed field.

      The following tests were performed:

      1. 100k number field, serial 1-100k, indexed: 3,190,784 bytes
      2. 100k text field (using same numbers) indexed minimal: 3,821,568 bytes
      3. 100k ditto (using serial A0000001...) ditto: 4,661,248 bytes
      4. 100k ditto (using serial ABC00000001...) ditto: 5,713,920 bytes
      5. 100k as IV., but with the ID as a minimal indexed multi-key: 13,778,944 bytes - comme ca:
          ABC00000001
          ABC0000000
          ABC000000
          ABC00000
          ABC0000
          ABC000
          ABC00
          ABC0
          ABC
          AB
          A

      Conclusion?

        You'd have to have a lot of records for it to be a big issue, but stick to text fields for multi-key potential [and avoid danger of text->number accidental relational failure or is it vice-a-versa or both that fail?], just use normal numbers if you want to economise - little penalty on space for these... but overall if I used 10 million records with ABC00000001 format it'd only be 600MB or so of indexes, 300MB over the pure number equivalence, but no risk of mismatches and a few additional trix up your relational sleeve as well, and no need to think about wot index style to use (KISS). [Obviously the multi-key has to die!]

  • are redundant and empty indexed fields a size problem?

      This is of interest to me because I want to have an Account table that holds either Person or Organisation data. If it holds Person details then the Organisation field, and the _mfk_Organisation [minimally indexed] will be empty. If it holds Organisation details then the Title, First Name, Last Name, and {Fn, Ln & _mfk_FnLn} fields [minimally indexed] will be empty. Is this going to cause redundant space issues? So do I really need to think about separating these two types of Account into two different tables??? [Entity relationship nightmare to my mind!]

      The data and indexing requirements look like this...

      Person fields:

      Organisation fields:

      dummy data

        I filled it in an 80:20 [Person:Org] ratio with lots of dummy data - random numbers in number fields, and lorem ipsum junk text in text fields to approximate title, first name, surname and organisation fields. I measured total file size [16,818,176 bytes], and then split data in two, i.e. 80k records of Person and 20k records of Org [as below].

        Person dummy data:

        Organisation dummy data:

        Then I measured each [after compacting], and then chopped out the irrelevant fields and re-measured [again after compacting]:

        number of bytes after>>>  split up& removal of irrelevant fields
        Person data [79999 records]:13,504,51213,471,744
        Organisation data [20k records]:3,452,9283,399,680

      Conclusion:

        Empty fields and their indexes use next to no space. Exactly what I wanted to hear, so I will tear ahead with my single Account table: