即使存在唯一检查,唯一字符串生成器也会生成重复项
Unique string generator generates duplicates even though a unique check exists
我正在制作一个小的 class,它使用 gfycat wordlists 来生成唯一的字符串。
<?php
namespace Jamosaur\Randstring;
class Randstring
{
private $adjectives;
private $animals;
private $min;
private $max;
private $case;
private $maxLength;
private $string;
public $combinations = [];
private $first;
private $second;
public $adjective;
public $animal;
public $number;
/**
* Randstring constructor.
* @param null $case (ucwords, ucfirst)
* @param int $maxLength
* @param int $min
* @param int $max
*/
public function __construct($case = null, $maxLength = 100, $min = 1, $max = 99)
{
$this->case = $case;
$this->maxLength = $maxLength;
$this->min = $min;
$this->max = $max;
$this->adjectives = explode(PHP_EOL, file_get_contents(__DIR__.'/dictionaries/adjectives.txt'));
$this->animals = explode(PHP_EOL, file_get_contents(__DIR__.'/dictionaries/animals.txt'));
}
/**
* @param null $first
* @param null $second
*/
public function generateNumbers($first = null, $second = null)
{
$this->first = ($first) ? $first : mt_rand(0, count($this->adjectives) - 1);
$this->second = ($second) ? $second : mt_rand(0, count($this->animals) - 1);
$this->number = mt_rand($this->min, $this->max);
if (isset($this->combinations[$this->first.'.'.$this->second.$this->number])) {
$this->generateNumbers($this->first, $this->second);
}
$this->combinations[$this->first.'.'.$this->second.$this->number] = 1;
}
/**
* Generate a string.
*/
public function generateString()
{
$this->generateNumbers();
$this->adjective = $this->adjectives[$this->first];
$this->animal = $this->animals[$this->second];
switch ($this->case) {
case 'ucfirst':
$this->string = ucfirst($this->adjective.$this->animal.$this->number);
break;
case 'ucwords':
$this->string = ucfirst($this->adjective).ucfirst($this->animal).ucfirst($this->number);
break;
default:
$this->string = $this->adjective.$this->animal.$this->number;
break;
}
}
/**
* @return mixed
*/
public function generate()
{
$this->generateString();
if (strlen($this->string) > $this->maxLength) {
return $this->generate();
}
return $this->string;
}
}
我添加了一个签入以记录在 generateNumbers()
中创建的每个组合,它应该将每个组合存储在一个数组中。
我设置了一个小测试来生成 10000 个唯一字符串,仅用于性能测试,这是通过以下代码片段完成的:
$rand = new Jamosaur\Randstring\Randstring(null, 10);
for ($i=0; $i < 10000; $i++) {
$t[$i] = $rand->generate();
}
echo 'Unique Strings: '.count(array_unique($t)).'<br>';
echo 'Combinations: '.count($rand->combinations).'<br>';
运行这个,预计会有10000个不重复的字符串。
我运行测试了10次,结果如下:
唯一字符串:9998
组合:527879
唯一字符串:9999
组合:518899
唯一字符串:9999
组合:515290
唯一字符串:9999
组合:516193
唯一字符串:10000
组合:526652
唯一字符串:10000
组合:516049
唯一字符串:10000
组合:523217
唯一字符串:10000
组合:509729
唯一字符串:10000
组合:517236
唯一字符串:10000
组合:512270
这里的逻辑是否有缺陷?测试限制为 10 个字符的字符串,但测试表明至少有 10000 个唯一字符串。
您的代码中存在几个问题:
a) 对已用组合的记忆存在缺陷
$this->combinations[$this->first.'.'.$this->second.$this->number] = 1;
对于给定的 $this->first 这将碰撞
$this->second = 10, $this->number = 11 (=1011)
$this->second = 101, $this->number = 1 (=1011)
在 $this->second 和 $this->number 之间添加分隔符
b) 你的单词列表中可能有重复项
我下载了文件,例如"green" 一词在形容词
中重复
c) 仔细调试代码的递归(自引用)特征。
d) 函数 generateNumbers($first = null, $second = null) 中参数的用途是什么?
我正在制作一个小的 class,它使用 gfycat wordlists 来生成唯一的字符串。
<?php
namespace Jamosaur\Randstring;
class Randstring
{
private $adjectives;
private $animals;
private $min;
private $max;
private $case;
private $maxLength;
private $string;
public $combinations = [];
private $first;
private $second;
public $adjective;
public $animal;
public $number;
/**
* Randstring constructor.
* @param null $case (ucwords, ucfirst)
* @param int $maxLength
* @param int $min
* @param int $max
*/
public function __construct($case = null, $maxLength = 100, $min = 1, $max = 99)
{
$this->case = $case;
$this->maxLength = $maxLength;
$this->min = $min;
$this->max = $max;
$this->adjectives = explode(PHP_EOL, file_get_contents(__DIR__.'/dictionaries/adjectives.txt'));
$this->animals = explode(PHP_EOL, file_get_contents(__DIR__.'/dictionaries/animals.txt'));
}
/**
* @param null $first
* @param null $second
*/
public function generateNumbers($first = null, $second = null)
{
$this->first = ($first) ? $first : mt_rand(0, count($this->adjectives) - 1);
$this->second = ($second) ? $second : mt_rand(0, count($this->animals) - 1);
$this->number = mt_rand($this->min, $this->max);
if (isset($this->combinations[$this->first.'.'.$this->second.$this->number])) {
$this->generateNumbers($this->first, $this->second);
}
$this->combinations[$this->first.'.'.$this->second.$this->number] = 1;
}
/**
* Generate a string.
*/
public function generateString()
{
$this->generateNumbers();
$this->adjective = $this->adjectives[$this->first];
$this->animal = $this->animals[$this->second];
switch ($this->case) {
case 'ucfirst':
$this->string = ucfirst($this->adjective.$this->animal.$this->number);
break;
case 'ucwords':
$this->string = ucfirst($this->adjective).ucfirst($this->animal).ucfirst($this->number);
break;
default:
$this->string = $this->adjective.$this->animal.$this->number;
break;
}
}
/**
* @return mixed
*/
public function generate()
{
$this->generateString();
if (strlen($this->string) > $this->maxLength) {
return $this->generate();
}
return $this->string;
}
}
我添加了一个签入以记录在 generateNumbers()
中创建的每个组合,它应该将每个组合存储在一个数组中。
我设置了一个小测试来生成 10000 个唯一字符串,仅用于性能测试,这是通过以下代码片段完成的:
$rand = new Jamosaur\Randstring\Randstring(null, 10);
for ($i=0; $i < 10000; $i++) {
$t[$i] = $rand->generate();
}
echo 'Unique Strings: '.count(array_unique($t)).'<br>';
echo 'Combinations: '.count($rand->combinations).'<br>';
运行这个,预计会有10000个不重复的字符串。
我运行测试了10次,结果如下:
唯一字符串:9998 组合:527879
唯一字符串:9999 组合:518899
唯一字符串:9999 组合:515290
唯一字符串:9999 组合:516193
唯一字符串:10000 组合:526652
唯一字符串:10000 组合:516049
唯一字符串:10000 组合:523217
唯一字符串:10000 组合:509729
唯一字符串:10000 组合:517236
唯一字符串:10000 组合:512270
这里的逻辑是否有缺陷?测试限制为 10 个字符的字符串,但测试表明至少有 10000 个唯一字符串。
您的代码中存在几个问题:
a) 对已用组合的记忆存在缺陷
$this->combinations[$this->first.'.'.$this->second.$this->number] = 1;
对于给定的 $this->first 这将碰撞
$this->second = 10, $this->number = 11 (=1011)
$this->second = 101, $this->number = 1 (=1011)
在 $this->second 和 $this->number 之间添加分隔符
b) 你的单词列表中可能有重复项 我下载了文件,例如"green" 一词在形容词
中重复c) 仔细调试代码的递归(自引用)特征。
d) 函数 generateNumbers($first = null, $second = null) 中参数的用途是什么?