Is there any way to optimize a query when working with a database table of 2 million+ rows?
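I'm working on a Laravel query that should count the data for the latest month and, for the last 3 months, group it by week. I have tried several approaches, but it still exceeds my memory limit and loads very slowly.
Below is the code I currently use to get the final result, but it has the same problem.
Any idea how to optimize the counting and grouping of the data?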
$data['pois']['total'] = PoiLocation::whereYear('created_at', Carbon::now()->year)
    ->whereMonth('created_at', Carbon::now()->month)
    ->count();

$pois = PoiLocation::where('created_at', '>', (new Carbon)->subMonths(3))
    ->get()
    ->sortBy('created_at')
    ->groupBy(function ($collection) {
        return Carbon::parse($collection->created_at)->isoWeek();
    });

if ($pois->count()) {
    foreach ($pois as $item => $value) {
        $data['pois']['weeks'][$item] = $value->count();
    }
} else {
    $data['pois']['weeks'] = [];
}

if ($data['pois']['weeks']) {
    $data['pois']['high'] = max($data['pois']['weeks']);
} else {
    $data['pois']['high'] = 1;
}
The relevant properties on the PoiLocation model:
protected $fillable = [
    'store_id', 'name', 'address1', 'address2', 'city', 'state', 'zip_code', 'dma_desc', 'country', 'lat', 'lon', 'target', 'is_verified', 'polygons', 'external_id', 'brandID', 'companyID'
];
protected $dates = ['created_at', 'updated_at'];
public $timestamps = true;
I think your main problem is here:
$pois = PoiLocation::where('created_at', '>', (new Carbon)->subMonths(3))
    ->get()
    ->sortBy('created_at')
    ->groupBy(function ($collection) {
        return Carbon::parse($collection->created_at)->isoWeek();
    });
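You are fetching every record created in the past three months, loading them all into memory, and only then sorting and grouping them. You should do the sorting and grouping in the database, before the records are fetched:
$pois = PoiLocation::where('created_at', '>', (new Carbon)->subMonths(3))
    ->orderBy('created_at')
    ->get();
This sorts by created_at in the database, although it still loads every matching row. The grouping needs a bit more thought: it turns out that grouping an entire result set inside a database query is somewhat tricky.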
If you only want the count per week, you can use something like:
PoiLocation::select(DB::raw("count(*) as count, WEEK(created_at) as week, YEAR(created_at) as year"))
    ->groupBy(['week', 'year'])
    ->get();
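As @IGP suggested, I think your goal is probably not to retrieve the whole data set but only to get metrics. In that case, pushing operations such as the count above into the database, grouped by week of the year, gets you the data you need without iterating over every record in memory.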
So, to meet your original requirements:
Total number of items created in the current month (your original query is fine):
$totalThisMonth = PoiLocation::whereYear('created_at', Carbon::now()->year)
    ->whereMonth('created_at', Carbon::now()->month)
    ->count();
Number of items per week over the last three months:
$itemsPerWeek = PoiLocation::select(DB::raw("count(*) as count, WEEK(created_at) as week"))
    ->where('created_at', '>', Carbon::now()->subMonths(3))
    ->groupBy('week')
    ->get();
The week with the most items:
$itemsPerWeek->sortBy('count')->last();
Or, if you only need the highest count itself:
$itemsPerWeek->max('count');
I think these should get you where you want to go.
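To tie this back to the array structure in the question, here is a minimal sketch in plain PHP of how the aggregated rows could be turned into the `weeks` and `high` values. The function name `summarizeWeeklyCounts` and the sample rows are hypothetical; the row shape (`week`, `count`) assumes the column aliases used in the per-week query.

```php
<?php

/**
 * Hypothetical helper: converts per-week aggregate rows into the
 * structure the question builds by hand.
 *
 * @param iterable<array{week: int|string, count: int|string}> $rows
 * @return array{weeks: array<int, int>, high: int}
 */
function summarizeWeeklyCounts(iterable $rows): array
{
    $weeks = [];

    foreach ($rows as $row) {
        // Database drivers may return numbers as strings, so cast explicitly.
        $weeks[(int) $row['week']] = (int) $row['count'];
    }

    // Fall back to 1 when there is no data, as the code in the question does.
    $high = $weeks === [] ? 1 : max($weeks);

    return ['weeks' => $weeks, 'high' => $high];
}

// Made-up sample rows standing in for $itemsPerWeek->toArray().
$sampleRows = [
    ['week' => 14, 'count' => 120],
    ['week' => 15, 'count' => 342],
    ['week' => 16, 'count' => 87],
];

$summary = summarizeWeeklyCounts($sampleRows);
```

With this shape, only one small row per week ever reaches PHP, so memory use no longer grows with the size of the table.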
You can use LazyCollections. This should greatly reduce your memory usage.
$poi = PoiLocation::query()
    ->where('created_at', '>', Carbon::now()->subMonths(3))
    ->orderBy('created_at') // sort in the database instead of spending more memory sorting in PHP
    ->cursor() // don't load every model into memory at once
    ->remember() // don't repeat the query (without this line the query would run 3 times: once for $poi->all() and twice for $poi->max())
    ->groupBy(function (PoiLocation $poiLocation) { // the parameter is a PoiLocation, not a collection; the type hint is optional
        return $poiLocation->created_at->isoWeek(); // created_at should already be a Carbon instance thanks to Eloquent's date casting
    })
    ->map->count();

$data['pois']['weeks'] = $poi->all();
$data['pois']['high'] = ($poi->max() !== null) ? $poi->max() : 1;
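This also simplifies your logic. $poi->all() returns a keyed array, whether or not it is empty. $poi->max() returns the maximum of the collection; if the collection is empty, it returns null. A simple ternary operator then handles that part of your logic.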
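The idea behind the lazy approach can be illustrated without Laravel: iterate over the rows one at a time and keep only a running count per ISO week, so memory use depends on the number of weeks rather than the number of rows. This is a minimal sketch, not the answer's exact method; the generator `fetchCreatedAt` is a hypothetical stand-in for a database cursor and the dates are made up. In Laravel, `LazyCollection::countBy()` should give a similar running tally, whereas grouping whole models has to keep every model of a group in memory until it is counted.

```php
<?php

/**
 * Hypothetical stand-in for a database cursor: yields one
 * created_at timestamp at a time instead of building a full array.
 *
 * @return Generator<int, string>
 */
function fetchCreatedAt(): Generator
{
    yield '2024-01-01 09:00:00'; // Monday, ISO week 1
    yield '2024-01-03 12:30:00'; // Wednesday, ISO week 1
    yield '2024-01-08 08:15:00'; // Monday, ISO week 2
    yield '2024-01-14 23:59:59'; // Sunday, ISO week 2
    yield '2024-01-15 00:00:00'; // Monday, ISO week 3
}

/**
 * Counts rows per ISO week while holding only the tally in memory.
 *
 * @param iterable<string> $timestamps
 * @return array<int, int> ISO week number => row count
 */
function countByIsoWeek(iterable $timestamps): array
{
    $counts = [];

    foreach ($timestamps as $timestamp) {
        // Format character "W" is the ISO 8601 week number of the year.
        $week = (int) (new DateTimeImmutable($timestamp))->format('W');
        $counts[$week] = ($counts[$week] ?? 0) + 1;
    }

    return $counts;
}

$weeks = countByIsoWeek(fetchCreatedAt());

// Same fallback as in the question: use 1 when there is no data.
$high = $weeks === [] ? 1 : max($weeks);
```

Because each timestamp is discarded as soon as it has been tallied, the approach trades the ability to inspect individual rows for a small, bounded memory footprint.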