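Constantly decreasing memory allocation performance. Code example
Please help me find out why the performance of the code below keeps decreasing.
If you run the code, you will see it print the number of iterations completed each second. After 10-15 seconds, the number of iterations per second is about half of what it was at the start.
The problem first showed up in Qt code; I then rewrote the test to use only std, and the problem reproduced.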
#include <chrono>
#include <cstdint>   // int64_t
#include <iostream>
#include <memory>
#include <string>
#include <vector>
using namespace std;

// Current time in microseconds since the clock's epoch.
int64_t GetUsecTime()
{
    return std::chrono::time_point_cast<std::chrono::microseconds>(std::chrono::high_resolution_clock::now()).time_since_epoch().count();
}

namespace IType {
    enum T {
        oI = 0,
        oI_2 = 1,
    };
}

class BaseClass
{
public:
    BaseClass(IType::T t)
        : m_type(t)
    {}
    virtual ~BaseClass() {}
    IType::T type()
    {
        return m_type;
    }
private:
    IType::T m_type;
};

class InheritedClass : public BaseClass
{
public:
    static shared_ptr<InheritedClass> CreateShared(const string& some)
    {
        //return shared_ptr<InheritedClass>(new InheritedClass(some));
        return make_shared<InheritedClass>(some);
    }
    static shared_ptr<InheritedClass> CreateShared()
    {
        //return shared_ptr<InheritedClass>(new InheritedClass());
        return make_shared<InheritedClass>();
    }
    InheritedClass(const string& some)
        : BaseClass(IType::oI)
        , someFiled(some)
    {}
    InheritedClass()
        : BaseClass(IType::oI)
        , someFiled("")
    {}
    void setField(const string& value)
    {
        someFiled = value;
    }
private:
    string someFiled;
};

class InheritedClass_2 : public BaseClass
{
public:
    static shared_ptr<InheritedClass_2> CreateShared()
    {
        //return shared_ptr<InheritedClass_2>(new InheritedClass_2());
        return make_shared<InheritedClass_2>();
    }
    InheritedClass_2()
        : BaseClass(IType::oI_2)
    {}
    void setField(int value)
    {
        someFiled = value;
    }
private:
    int someFiled;
};

class someclass
{
public:
    someclass()
    {
        int64_t dataSize(0);
        int64_t oldDataSize(0);
        int64_t time0(GetUsecTime());
        int64_t time(time0);
        for (int i = 0; i < 8000000; i++)
        {
            // TEST CONSTRUCTORS
            //auto instance = InheritedClass::CreateShared(); // No Problem...
            auto instance = InheritedClass::CreateShared(""); // PROBLEM!! With this line, speed is constantly decreasing
            //TEST CODE:
            //instance->setField("just something"); // PROBLEM!! With this line, speed is constantly decreasing
            V.push_back(instance);
            auto instance_2 = InheritedClass_2::CreateShared();
            instance_2->setField(i);
            V.push_back(instance_2);
            dataSize = i;
            // Roughly once per second, print how many iterations ran since the last report.
            if (GetUsecTime() - time > 1000000)
            {
                time = GetUsecTime();
                cout << "Processed: " << dataSize - oldDataSize << endl;
                oldDataSize = dataSize;
            }
        }
        cout << V.size() << " in " << (GetUsecTime() - time0) / 1000.0 << " ms" << endl;
    }
private:
    vector<shared_ptr<BaseClass> > V;
};

int main()
{
    someclass s;
    return 0;
}
The code was run both from Qt and from Visual Studio.
Output of the program run from Qt (the rate drops from about 220000 to 67000):
Processed: 31681
Processed: 196038
Processed: 234112
Processed: 229468
Processed: 216378
Processed: 227070
Processed: 198330
Processed: 211321
Processed: 197137
Processed: 151333
Processed: 167995
Processed: 168307
Processed: 163719
Processed: 153696
Processed: 110894
Processed: 143917
Processed: 137006
Processed: 129974
Processed: 127678
Processed: 124093
Processed: 124029
Processed: 123018
Processed: 118595
Processed: 116676
Processed: 115023
Processed: 73030
Processed: 111768
Processed: 110222
Processed: 103588
Processed: 106266
Processed: 105271
Processed: 105031
Processed: 102042
Processed: 100258
Processed: 99404
Processed: 98955
Processed: 97007
Processed: 95901
Processed: 93696
Processed: 91405
Processed: 91061
Processed: 90175
Processed: 87727
Processed: 87448
Processed: 85510
Processed: 84238
Processed: 41837
Processed: 76040
Processed: 75694
Processed: 82918
Processed: 81515
Processed: 80957
Processed: 79657
Processed: 80840
Processed: 79110
Processed: 78720
Processed: 76078
Processed: 76067
Processed: 75412
Processed: 75546
Processed: 74052
Processed: 73386
Processed: 69608
Processed: 66880
Processed: 68731
Processed: 70560
Processed: 68979
Processed: 69985
Processed: 70516
Processed: 68464
Processed: 67379
Processed: 67980
Processed: 67746
Processed: 67332
16000000 in 74537 ms
Output of the program run from Visual Studio (the initial values are lower and less stable, but there is no decrease):
Processed: 81138
Processed: 78107
Processed: 81158
Processed: 101733
Processed: 69418
Processed: 99900
Processed: 54649
Processed: 94161
Processed: 95660
Processed: 31477
Processed: 97066
Processed: 97588
Processed: 99001
Processed: 99554
Processed: 492
Processed: 99197
Processed: 100049
Processed: 99765
Processed: 100066
Processed: 97667
Processed: 93807
Processed: 100146
Processed: 99378
Processed: 99824
Processed: 98228
Processed: 97943
Processed: 99552
Processed: 100299
Processed: 99753
Processed: 90703
Processed: 98276
Processed: 99480
Processed: 99569
Processed: 99528
Processed: 99058
Processed: 98939
Processed: 97637
Processed: 99334
Processed: 99713
Processed: 99540
Processed: 99212
Processed: 99339
Processed: 98781
Processed: 40334
Processed: 98810
Processed: 99134
Processed: 99953
Processed: 99884
Processed: 99891
Processed: 100036
Processed: 100037
Processed: 98182
Processed: 98393
Processed: 99091
Processed: 98359
Processed: 99515
Processed: 100710
Processed: 99065
Processed: 100507
Processed: 99915
Processed: 96591
Processed: 97256
Processed: 100400
Processed: 99551
Processed: 7829
Processed: 100520
Processed: 99480
Processed: 100201
Processed: 99145
Processed: 100898
Processed: 100403
Processed: 99873
Processed: 99761
Processed: 99590
Processed: 99795
Processed: 100142
Processed: 99396
Processed: 99607
Processed: 98091
Processed: 97379
Processed: 98045
Processed: 98448
Processed: 97853
Processed: 98633
Processed: 96140
16000000 in 96518.3 ms
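I think this is because std::vector grows geometrically each time it runs out of room for a new item (the growth factor is implementation-defined; doubling is common). When that happens, it allocates a larger block and copies or moves the current contents into it. Once the number of items gets large, that delay becomes noticeable.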
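Also, to find the lines where the most CPU time is spent, you really need a profiler. The alternative of commenting and uncommenting parts of the code cannot tell you exactly where the problem is or how to optimize it.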
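- During the tests, the program was running under the Qt debugger.
- When built as a release and run from Windows as a standalone EXE, it runs 20 times faster! And without those slowdowns.
So, the solution:
- Do not trust code run under the Qt debugger; it is several times slower than the real application.
- Do not measure time or estimate performance under the Qt debugger.
New output from the release exe: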
Processed: 1923592
Processed: 2062627
Processed: 1993109
Processed: 2541079
Processed: 1666522
Processed: 2562799
Processed: 1186022
Processed: 2578212
Processed: 2590275
40000000 in 9599 ms