Better Genetics Handling Part 2

Last time, I explained why changing our genetics handling was necessary and started on a programmatic approach for our most basic genetics strings: those with only a mother, a father, and an x or a / to denote what kind of relationship we knew about. However, as anyone familiar with cannabis genetics knows, things aren’t always this easy.

Many times, our genetics strings are far more complicated. Here are a few examples:

(({221} x {669}) x {669}) x (({221} x {669}) x {669})
{329} / {114} / ({1934} / {2045})
({1927} / {2093}) / {137} / ???

In these cases, we have some heavier lifting to do, so I’m taking them a chunk at a time. Grabbing our data works the same way it did for the simple examples. First, I wanted to handle strains where we’ve got definite parent relationships and one of the listed parents is itself made of more than one strain. For example:

({207} x {246}) x {1799}
({1229} x {115}) x {115}

Here is what the code looked like for ensuring these were the types of records we were getting:

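$values = array_count_values($strainParts);
// isset() avoids an undefined-index notice when the string contains no 'x' at all
if (isset($values['x']) && $values['x'] > 1 && strpos($geneticsString, '(') !== false && strpos($geneticsString, '/') === false && strpos($geneticsString, '?') === false) {
    $matches = explode(')', $geneticsString);
    if (count($matches) == 3) {
        // Two closing parens (for example, both parents are crosses): too complicated for this pass
        continue;
    }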

Here we can see there are no /’s, which means we know which parent went where. In addition, we’ve got strains whose parents are themselves made of more than one strain. These are denoted by the parentheses. It was important to me to find strains listed like that so that I could start inserting those crosses as new strains in the database. So, for the first example listed above, here’s what I wanted the order of operations to be:

  1. Figure out whether the parens were on the mother side (left) or father side (right).
  2. Determine whether the parent made of two strains already existed in the database or not.
  3. Insert the parent made of two strains as a new strain in the database.
  4. Insert the correct genetics information for that new parent strain.
  5. Create our genetics record and use our newly created or already existing strain for the parent appropriately.
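
Before any of those steps run, a string has to pass the filter shown earlier. Here is a minimal sketch of that check as a standalone function, so it can be tried without the rest of the script. The name isSingleParenCross is made up for this example, and the check is deliberately stricter than the original: it assumes parents are joined with " x " (spaces included) and accepts exactly one parenthesized parent.

```php
<?php
// Hypothetical helper for illustration; not part of the original script.
function isSingleParenCross(string $geneticsString): bool
{
    // Unknown relationships ('/') and unknown parents ('?') are handled elsewhere.
    if (strpos($geneticsString, '/') !== false || strpos($geneticsString, '?') !== false) {
        return false;
    }
    // Exactly one parenthesized parent, e.g. "({207} x {246}) x {1799}".
    if (substr_count($geneticsString, '(') !== 1 || substr_count($geneticsString, ')') !== 1) {
        return false;
    }
    // Two crosses: one inside the parens and one joining the two parents.
    return substr_count($geneticsString, ' x ') === 2;
}

var_dump(isSingleParenCross('({207} x {246}) x {1799}'));           // bool(true)
var_dump(isSingleParenCross('{329} / {114} / ({1934} / {2045})'));  // bool(false)
```

Strings with nested or multiple parenthesized parents fall through to false here, which matches the plan of leaving them for a later pass.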

For step one, I sorta did a trick to determine whether it was a mother or a father with the ()’s. When exploding on the ) character, we can only get one of two kinds of results back:

var_dump(explode(')', '({strain1 x strain2}) x {strain3}'));
/*
array(2) {
  [0]=>
  string(20) "({strain1 x strain2}"
  [1]=>
  string(12) " x {strain3}"
}
*/
 
var_dump(explode(')', '{strain1} x ({strain2} x {strain3})'));
/*
array(2) {
  [0]=>
  string(34) "{strain1} x ({strain2} x {strain3}"
  [1]=>
  string(0) ""
}
*/

As you may or may not have caught, if $matches[1] is an empty string, we know that the father was the parent with the parens. We store the result as variable names that we’ll use later as variable variables. Here’s what the code looks like for that:

if (strlen($matches[1]) == 0) {
    $varName = 'fatherId';
    $otherVarName = 'motherId';
} else {
    $varName = 'motherId';
    $otherVarName = 'fatherId';
}
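
The same trick can be wrapped in a small function for experimenting. This is only a sketch: parenSide is a hypothetical name, it assumes the string has already been filtered down to exactly one parenthesized parent, and it adds a trim() (not in the original) so trailing whitespace doesn't flip the answer.

```php
<?php
// Hypothetical helper for illustration; not part of the original script.
function parenSide(string $geneticsString): string
{
    $pieces = explode(')', $geneticsString);
    // Nothing after the closing paren means the parens ended the string,
    // so the parenthesized parent sits on the father (right) side.
    return trim(end($pieces)) === '' ? 'fatherId' : 'motherId';
}

echo parenSide('({207} x {246}) x {1799}'), "\n"; // motherId
echo parenSide('{1799} x ({207} x {246})'), "\n"; // fatherId
```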

From there, I’ll let the comments in the code serve as a description for what happened to satisfy the rest of the steps we previously mentioned:

// Grab our genetics to check
$geneticsToCheck = substr($matches[0], strpos($matches[0], '('));
 
// Grab the other strain number that is in the genetics string
preg_match('/[0-9]+/', str_replace($geneticsToCheck, '', $geneticsString), $matches);
$$otherVarName = $matches[0];
 
// Explode our strains making up our paren'd strain
$parts = explode('x', str_replace(array('{', '(', '}', ' '), '', $geneticsToCheck));
if (count($parts) == 2) {
    // Check some stuffs!
    if (is_numeric($parts[0]) && is_numeric($parts[1])) {
        // See if a strain with exactly these genetics already exists
        $sql = 'SELECT strain_id 
               FROM strain_reports 
               WHERE LOWER(genetics_string) LIKE \'{' . $parts[0] . '} x {' . $parts[1] . '}\'';
        $result = $db2->query($sql);
 
        /**
          * If we only get one row back, we know that this combination of genetics already exists in the
          * database as a strain and we want to use that strain id as the parent ID.
          */
        if (mysqli_num_rows($result) == 1) {
            $row = mysqli_fetch_assoc($result);
            $$varName = $row['strain_id'];
            $result->free();
        } else if (mysqli_num_rows($result) > 1) {
            // We've got two strains with the same genetics, not good. Need to check it manually later.
            $result->free();
            continue;
        } else {
            // This is a new strain for the database
            $result->free();
 
            // Grab the names of the strains that make up this cross to name our newly created strain
            $sql = "SELECT strain_id, name, display_name 
                   FROM strain_reports WHERE strain_id IN (" . implode(',', $parts) . ")";
            $result = $db2->query($sql);
            // Key the names by strain_id: the rows are not guaranteed to come back in $parts order
            $names = array();
            while ($row = $result->fetch_assoc()) {
                $names[$row['strain_id']] = strlen($row['display_name']) ? $row['display_name'] : ucwords($row['name']);
            }
            $result->free();
 
            // Build "Mother x Father" in the same order as the genetics string
            $name = $display_name = $names[$parts[0]] . ' x ' . $names[$parts[1]];
 
            // Insert our new strain into the DB
            $sql = "INSERT INTO strain_reports (name, display_name, genetics_string, active)
                   VALUES (?, ?, ?, 1)";
            $stmt = $db2->prepare($sql);
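            // bind_param() binds by reference, so the cleaned-up string needs its own variable
            $newGeneticsString = str_replace('(', '', $geneticsToCheck);
            $stmt->bind_param('sss', $name, $display_name, $newGeneticsString);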
            $stmt->execute();
            $stmt->close();
 
            // Set our appropriate father or mother ID as our newly inserted strain
            $$varName = $db2->insert_id;
 
            // Insert the genetics information for this new strain
            insertMotherFather($db2->insert_id, $parts[0], $parts[1]);
            $newStrains++;
        }
 
        // Now we can insert our genetics information for our strain with our new or existing strain id
        insertMotherFather($strainId, $motherId, $fatherId);
    }
    $numUpdated++;
}
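
The parsing half of the listing above can also be tried on its own, without a database. The sketch below swaps the explode/substr/str_replace steps for two regular expressions; that is an alternative I'm showing for illustration, not what the script does. The function name splitParenCross and the shape of the returned array are assumptions, and it only accepts strings formatted exactly like the two examples at the top of this post.

```php
<?php
// Hypothetical helper for illustration; not part of the original script.
// Returns null when the string is not a single-paren cross of numeric strain ids.
function splitParenCross(string $geneticsString): ?array
{
    $id = '\{(\d+)\}'; // one strain id wrapped in braces, e.g. {207}

    // Parens on the mother (left) side: ({a} x {b}) x {c}
    if (preg_match('/^\(' . $id . ' x ' . $id . '\) x ' . $id . '$/', $geneticsString, $m)) {
        return array('side' => 'motherId', 'pair' => array($m[1], $m[2]), 'other' => $m[3]);
    }

    // Parens on the father (right) side: {c} x ({a} x {b})
    if (preg_match('/^' . $id . ' x \(' . $id . ' x ' . $id . '\)$/', $geneticsString, $m)) {
        return array('side' => 'fatherId', 'pair' => array($m[2], $m[3]), 'other' => $m[1]);
    }

    return null;
}

print_r(splitParenCross('({1229} x {115}) x {115}'));
```

Here 'pair' holds the two strains inside the parens (the mother and father of the parent strain that may need to be created), 'other' is the remaining parent, and 'side' says which ID variable the parenthesized parent belongs in.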

Checking our $numUpdated and $newStrains vars at the end told us that we processed another 54 strains and added another 40 new ones, which together provided an additional 94 rows of genetics information in our DB.

So, we’ve added another 40 strains to what is already the largest cannabis strain database on the planet and provided genetic information for another 10% of the remaining strains that weren’t covered. We’ve only got another 400 strains to cover and we’ve got a great base of code to accomplish this. With a few modifications we’ll not only be able to handle our genetics strings with ???s in them, but also start handling the larger and more complicated ones.

There will definitely be some manual data entry and massaging as we get towards the end and are presented with only the more complicated strains, but until then, stay tuned for part 3 where we will cover as many strains as possible in a programmatic way.

If you liked this post or just want to keep up with the latest developments concerning Smokereports.com, please consider subscribing to my feed or following smokereports on Twitter.

May 12, 2011 · admin
Posted in: development, genetics, list, mysql, php
