Another day, another programming assessment test. This time I was asked to generate some random data, then examine each item to work out its data type. It is not a very difficult thing to do, and I could probably have completed it in fewer lines. As usual, though, I am pretty sure there are better ways to do this.
In short, the generator is supposed to produce a 10MB file (assuming MB here means 10^6 bytes rather than 2^20 bytes, which is apparently now called a mebibyte, MiB). The file contains multiple lines, and each line is expected to have four fields: one of each of the following types, in random order.
- alphanumeric string
- alphabetical string
- real number
- integer
This is the generator part of the program:
#!/usr/bin/env php
<?php
// 10MB, taking 1MB as 10^6 bytes.
define('OUTPUT_LIMIT_FILESIZE', 10 * pow(10, 6));

function generator($file_path_output) {
    $output_writer = output_get_writer(file_writable_to_path($file_path_output));
    // Keep appending lines until the file reaches the size limit.
    do {
        $output_writer(array(
            // alphabetical string: A-Z, a-z
            string_get_builder(array(array(65, 90), array(97, 122)), rand(1, 20)),
            // alphanumeric string: 0-9, A-Z, a-z, padded with random spaces
            string_get_builder(array(array(48, 57), array(65, 90), array(97, 122)), rand(1, 20), rand(0, 10), rand(0, 10)),
            // integer
            function() {
                return rand();
            },
            // real number
            function() {
                return (rand() / getrandmax()) * rand();
            }));
    } while(file_check_size($file_path_output) < OUTPUT_LIMIT_FILESIZE);
}

function string_get_builder(array $ascii_ranges, $string_size, $space_size_before = 0, $space_size_after = 0) {
    // The closure calls itself until the string is long enough, then pads it.
    return $builder = function($result = '') use(&$builder, $ascii_ranges, $string_size, $space_size_before, $space_size_after) {
        return strlen($result) < $string_size ?
            $builder(sprintf("%s%s", $result, character_get_random($ascii_ranges)))
            : sprintf(
                "%s%s%s",
                implode('', array_fill(0, $space_size_before, ' ')),
                $result,
                implode('', array_fill(0, $space_size_after, ' '))
            );
    };
}

function character_get_random(array $ascii_ranges) {
    // Pick one of the ranges, then a random code point inside it.
    return chr(call_user_func_array('rand', $ascii_ranges[array_rand($ascii_ranges)]));
}

function file_writable_to_path($file_path) {
    try {
        return fopen($file_path, 'w');
    } catch(ErrorException $error) {
        log_output_screen('Error in opening file.', TRUE);
        // Without a file handle there is nothing useful left to do.
        exit(1);
    }
}

function file_check_size($file_path) {
    clearstatcache(TRUE, $file_path);
    return filesize($file_path);
}

function output_get_writer($file_output) {
    return function(array $content) use($file_output) {
        shuffle($content);
        // Separator first: the reversed argument order is rejected by PHP 8.
        $output_content = implode(',', array_map('call_user_func', $content));
        fprintf($file_output, '%s%s', $output_content, PHP_EOL);
        fflush($file_output);
    };
}

function log_output_screen($message, $debug = FALSE) {
    file_put_contents(
        $debug ? 'php://stderr' : 'php://stdout',
        sprintf("%s DEBUG: %s%s", date('c'), $message, PHP_EOL));
}

// E_ALL already includes E_STRICT.
error_reporting(E_ALL);
set_error_handler(function($errno, $errstr, $errfile, $errline) {
    // error was suppressed with the @-operator or masked by error_reporting
    if(!(error_reporting() & $errno)) {
        return FALSE;
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});

generator($argv[1]);
There is not much to say about the code, and there is a lot that could be improved (better error handling, for example). I probably need to find a way to ensure that a truly alphanumeric string is always generated, instead of relying on the random generator to hopefully pick a number (RNG no like me in Diablo III, so I don't see why it would be different here).
Next is the consumer part:
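One possible way to get that guarantee is to seed the string with one digit and one letter, fill the rest from the combined pool, and shuffle. This is only a sketch and not part of the submitted solution: the helper name `string_alphanumeric_guaranteed` is made up for illustration, and it assumes a minimum length of 2, since a single character cannot contain both a digit and a letter.

```php
<?php
// Hypothetical helper, not part of the original generator.
// Builds a string of $size characters (raised to at least 2) that always
// contains at least one digit and at least one letter.
function string_alphanumeric_guaranteed($size) {
    $digits = '0123456789';
    $letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $pool = $digits . $letters;
    $size = max(2, $size);

    // Seed with one character from each required class.
    $characters = array(
        $digits[rand(0, strlen($digits) - 1)],
        $letters[rand(0, strlen($letters) - 1)]
    );
    // Fill the remainder from the combined pool.
    while(count($characters) < $size) {
        $characters[] = $pool[rand(0, strlen($pool) - 1)];
    }
    // Shuffle so the seeded characters do not always come first.
    shuffle($characters);
    return implode('', $characters);
}

echo string_alphanumeric_guaranteed(rand(2, 20)), PHP_EOL;
```

The trade-off is that the field can no longer be a single character, which may or may not matter for the assessment.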
#!/usr/bin/env php
<?php
function line_reader($input_reader) {
    $input_line = $input_reader();
    while($input_line !== FALSE) {
        array_map('item_print_type', $input_line);
        $input_line = $input_reader();
    }
}

function item_print_type($item) {
    $type = 'alphanumeric';
    if(strpos($item, '.') !== FALSE) {
        // only real numbers contain a decimal point
        $type = 'real numbers';
    } else if(preg_match('/^\d+$/', $item) === 1) {
        // digits only; is_numeric() would also accept strings such as "1e5"
        $type = 'integer';
    } else {
        // alphabetical if the leading run of non-digits spans the whole item
        preg_match('/[^\d]*/', $item, $matches);
        if(array_shift($matches) === $item) {
            $type = 'alphabetical strings';
        }
    }
    printf('%s - %s%s', $item, $type, PHP_EOL);
}

function input_get_reader($input_file) {
    return function() use($input_file) {
        $result = fgets($input_file);
        // fgets() returns FALSE once there is nothing left to read
        return $result !== FALSE ?
            array_map('trim', explode(',', trim($result)))
            : FALSE;
    };
}

function file_readable_from_path($input_file_path) {
    try {
        return fopen($input_file_path, 'r');
    } catch(ErrorException $error) {
        log_output_screen('Error in opening file.', TRUE);
        // Without a file handle there is nothing useful left to do.
        exit(1);
    }
}

function log_output_screen($message, $debug = FALSE) {
    file_put_contents(
        $debug ? 'php://stderr' : 'php://stdout',
        sprintf("%s DEBUG: %s%s", date('c'), $message, PHP_EOL));
}

// E_ALL already includes E_STRICT.
error_reporting(E_ALL);
set_error_handler(function($errno, $errstr, $errfile, $errline) {
    // error was suppressed with the @-operator or masked by error_reporting
    if(!(error_reporting() & $errno)) {
        return FALSE;
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
});

line_reader(input_get_reader(file_readable_from_path($argv[1])));
The obvious fun part is deducing the data types. I got lazy and started by deciding whether an item is a real number by looking for a decimal point in the string. If there is none, I check whether the item is numeric (it is not a real number, so if it is a number it must be an integer). Otherwise, I check whether any digits exist in the string at all, to tell an alphanumeric string from an alphabetical one. This part could use some serious optimization (better rules, better regex, etc.).
Regardless of the outcome, I found it just as fun as the other quizzes, if not more (yes, yes, I know I failed the previous one REALLY hard).
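As a rough idea of what "better rules" might look like, the checks could be written as an ordered table of anchored patterns, where the first match wins. This is a sketch under assumptions rather than a drop-in replacement: `item_get_type` is a hypothetical name, it returns the label instead of printing it, it expects the item to be trimmed already, and the real-number pattern assumes the way PHP prints floats (digits, a decimal point, digits, and an optional exponent).

```php
<?php
// Hypothetical alternative to item_print_type(); returns the label
// instead of printing it. Assumes $item has already been trimmed.
function item_get_type($item) {
    // Order matters: the first pattern that matches decides the type.
    $rules = array(
        'integer' => '/^\d+$/D',
        'real numbers' => '/^\d+\.\d+(E[+-]?\d+)?$/iD',
        'alphabetical strings' => '/^[A-Za-z]+$/D',
        'alphanumeric' => '/^[0-9A-Za-z]+$/D'
    );
    foreach($rules as $type => $pattern) {
        if(preg_match($pattern, $item) === 1) {
            return $type;
        }
    }
    // Anything that fits no rule is reported rather than guessed.
    return 'unknown';
}

foreach(array('42', '3.14', 'abc', 'a1b2', '1e5', '?') as $item) {
    printf('%s - %s%s', $item, item_get_type($item), PHP_EOL);
}
```

With these rules a string such as `1e5` is reported as alphanumeric rather than as a number. Two ambiguities remain either way: an alphanumeric field that happens to contain only digits cannot be told apart from an integer, and a real number that prints without a decimal point would be reported as an integer.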