More than two years have passed since the last check of the LLVM project code using our PVS-Studio analyzer. Let's make sure that the PVS-Studio analyzer is still the leading tool for detecting errors and potential vulnerabilities. To do this, let's check and find new bugs in the LLVM 8.0.0 release.
Article to be written
To be honest, I did not want to write this article. It is not interesting to write about a project that we have repeatedly checked (
Every time a new version of LLVM is released or updated
Look, the new version of Clang Static Analyzer has learned how to find new bugs! It seems to me that the relevance of using PVS-Studio is decreasing. Clang finds more bugs than before and catches up with PVS-Studio in terms of capabilities. What do you think about it?
To this I always want to answer something along the lines of:
We don't sit idle either! We have significantly improved the capabilities of the PVS-Studio analyzer. So don't worry, we continue to lead as before.
Unfortunately this is a bad answer. It has no proofs. And that is why I am writing this article now. So, the LLVM project was once again checked and various errors were found in it. Those that seemed interesting to me, I will now demonstrate. These errors cannot be found by Clang Static Analyzer (or it is extremely inconvenient to do it with its help). And we can. And I found and wrote out all these errors in one evening.
But the writing of the article was delayed for several weeks. I just couldn't bring myself to put it all in the form of text :).
By the way, if you are interested in what technologies are used in the PVS-Studio analyzer to detect errors and potential vulnerabilities, then I suggest you familiarize yourself with this
New and old diagnostics
As already noted, about two years ago, the LLVM project was once again checked, and the errors found were corrected. Now this article will introduce a new batch of errors. Why were new bugs found? There are 3 reasons for this:
- The LLVM project is evolving, old code is being changed and new code is appearing. Naturally, there are new errors in the modified and written code. This demonstrates well that static analysis should be applied regularly and not on a case-by-case basis. Our articles show well the capabilities of the PVS-Studio analyzer, but this has nothing to do with improving the quality of the code and reducing the cost of fixing errors. Use a static code analyzer regularly!
- We refine and improve existing diagnostics. Therefore, the analyzer can detect errors that were not noticed during previous checks.
- PVS-Studio has new diagnostics, which didn't exist 2 years ago. I decided to separate them into a separate section to visually show the development of PVS-Studio.
Defects identified by diagnostics that existed 2 years ago
Fragment N1: Copy-Paste
static bool ShouldUpgradeX86Intrinsic(Function *F, StringRef Name) {
if (Name == "addcarryx.u32" || // Added in 8.0
....
Name == "avx512.mask.cvtps2pd.128" || // Added in 7.0
Name == "avx512.mask.cvtps2pd.256" || // Added in 7.0
Name == "avx512.cvtusi2sd" || // Added in 7.0
Name.startswith("avx512.mask.permvar.") || // Added in 7.0 // <=
Name.startswith("avx512.mask.permvar.") || // Added in 7.0 // <=
Name == "sse2.pmulu.dq" || // Added in 7.0
Name == "sse41.pmuldq" || // Added in 7.0
Name == "avx2.pmulu.dq" || // Added in 7.0
....
}
PVS-Studio warning:
It is double checked that the name starts with the substring "avx512.mask.permvar.". In the second check, they clearly wanted to write something else, but forgot to correct the copied text.
Fragment N2: Typo
enum CXNameRefFlags {
CXNameRange_WantQualifier = 0x1,
CXNameRange_WantTemplateArgs = 0x2,
CXNameRange_WantSinglePiece = 0x4
};
void AnnotateTokensWorker::HandlePostPonedChildCursor(
CXCursor Cursor, unsigned StartTokenIndex) {
const auto flags = CXNameRange_WantQualifier | CXNameRange_WantQualifier;
....
}
PVS-Studio warning: V501 There are identical sub-expressions 'CXNameRange_WantQualifier' to the left and to the right of the '|' operator. CIndex.cpp 7245
The same named constant is used twice due to a typo CXNameRange_WantQualifier.
Fragment N3: Operator precedence confusion
int PPCTTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val, unsigned Index) {
....
if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian() ? 1 : 0)
return 0;
....
}
PVS-Studio warning:
In my opinion, this is a very beautiful mistake. Yes, I know that I have strange ideas about beauty :).
Now, according to
(ISD == ISD::EXTRACT_VECTOR_ELT && (Index == ST->isLittleEndian())) ? 1 : 0
From a practical point of view, such a condition does not make sense, since it can be reduced to:
(ISD == ISD::EXTRACT_VECTOR_ELT && Index == ST->isLittleEndian())
This is a clear mistake. Most likely, they wanted to compare 0/1 with a variable Index. To fix the code, you need to add parentheses around the ternary operator:
if (ISD == ISD::EXTRACT_VECTOR_ELT && Index == (ST->isLittleEndian() ? 1 : 0))
By the way, the ternary operator is very dangerous and provokes logical errors. Be very careful with it and don't be greedy with parentheses. I have looked at this topic in more detail.
Fragment N4, N5: Null pointer
Init *TGParser::ParseValue(Record *CurRec, RecTy *ItemType, IDParseMode Mode) {
....
TypedInit *LHS = dyn_cast<TypedInit>(Result);
....
LHS = dyn_cast<TypedInit>(
UnOpInit::get(UnOpInit::CAST, LHS, StringRecTy::get())
->Fold(CurRec));
if (!LHS) {
Error(PasteLoc, Twine("can't cast '") + LHS->getAsString() +
"' to string");
return nullptr;
}
....
}
PVS-Studio warning:
If the pointer LHS is null, then a warning should be issued. However, instead, this very null pointer will be dereferenced: LHS->getAsString().
This is a very typical situation when an error is hidden in an error handler, since no one is testing them. Static analyzers check all reachable code, no matter how often it is used. This is a very good example of how static analysis complements other testing and error prevention techniques.
Similar pointer handling error RHS allowed in code just below: V522 [CWE-476] Dereferencing of the null pointer 'RHS' might take place. TGParser.cpp 2186
Fragment N6: Using the pointer after moving
static Expected<bool>
ExtractBlocks(....)
{
....
std::unique_ptr<Module> ProgClone = CloneModule(BD.getProgram(), VMap);
....
BD.setNewProgram(std::move(ProgClone)); // <=
MiscompiledFunctions.clear();
for (unsigned i = 0, e = MisCompFunctions.size(); i != e; ++i) {
Function *NewF = ProgClone->getFunction(MisCompFunctions[i].first); // <=
assert(NewF && "Function not found??");
MiscompiledFunctions.push_back(NewF);
}
....
}
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'ProgClone' might take place. miscompilation.cpp 601
Start smart pointer ProgClone ceases to own the object:
BD.setNewProgram(std::move(ProgClone));
In fact, now ProgClone is a null pointer. Therefore, a null pointer should be dereferenced just below:
Function *NewF = ProgClone->getFunction(MisCompFunctions[i].first);
But, in fact, this will not happen! Note that the loop is not actually being executed.
At the beginning of the container MiscompiledFunctions cleared:
MiscompiledFunctions.clear();
Next, the size of this container is used in the loop condition:
for (unsigned i = 0, e = MisCompFunctions.size(); i != e; ++i) {
It is easy to see that the loop is not running. I think this is also a mistake, and the code should be written differently.
It seems that we have met the very famous parity of errors! One mistake masks another :).
Fragment N7: Using the pointer after moving
static Expected<bool> TestOptimizer(BugDriver &BD, std::unique_ptr<Module> Test,
std::unique_ptr<Module> Safe) {
outs() << " Optimizing functions being tested: ";
std::unique_ptr<Module> Optimized =
BD.runPassesOn(Test.get(), BD.getPassesToRun());
if (!Optimized) {
errs() << " Error running this sequence of passes"
<< " on the input program!n";
BD.setNewProgram(std::move(Test)); // <=
BD.EmitProgressBitcode(*Test, "pass-error", false); // <=
if (Error E = BD.debugOptimizerCrash())
return std::move(E);
return false;
}
....
}
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Test' might take place. miscompilation.cpp 709
Again the same situation. At the beginning, the contents of the object are moved, and then it is used as if nothing had happened. I see this situation more and more often in program code, after C++ introduced move semantics. This is why I love the C++ language! There are more and more ways to shoot yourself in the foot. The PVS-Studio analyzer will always have a job :).
Fragment N8: Null pointer
void FunctionDumper::dump(const PDBSymbolTypeFunctionArg &Symbol) {
uint32_t TypeId = Symbol.getTypeId();
auto Type = Symbol.getSession().getSymbolById(TypeId);
if (Type)
Printer << "<unknown-type>";
else
Type->dump(*this);
}
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Type' might take place. PrettyFunctionDumper.cpp 233
In addition to error handlers, debug printout functions are usually not tested either. We have just such a case. The function is waiting for the user, who, instead of solving their problems, will be forced to fix it.
Correctly:
if (Type)
Type->dump(*this);
else
Printer << "<unknown-type>";
Fragment N9: Null pointer
void SearchableTableEmitter::collectTableEntries(
GenericTable &Table, const std::vector<Record *> &Items) {
....
RecTy *Ty = resolveTypes(Field.RecType, TI->getType());
if (!Ty) // <=
PrintFatalError(Twine("Field '") + Field.Name + "' of table '" +
Table.Name + "' has incompatible type: " +
Ty->getAsString() + " vs. " + // <=
TI->getType()->getAsString());
....
}
PVS-Studio warning: V522 [CWE-476] Dereferencing of the null pointer 'Ty' might take place. SearchableTableEmitter.cpp 614
I think that everything is clear and does not require explanation.
Fragment N10: Typo
bool FormatTokenLexer::tryMergeCSharpNullConditionals() {
....
auto &Identifier = *(Tokens.end() - 2);
auto &Question = *(Tokens.end() - 1);
....
Identifier->ColumnWidth += Question->ColumnWidth;
Identifier->Type = Identifier->Type; // <=
Tokens.erase(Tokens.end() - 1);
return true;
}
PVS-Studio warning:
There is no point in assigning a variable to itself. They probably meant to write:
Identifier->Type = Question->Type;
Fragment N11: Suspicious break
void SystemZOperand::print(raw_ostream &OS) const {
switch (Kind) {
break;
case KindToken:
OS << "Token:" << getToken();
break;
case KindReg:
OS << "Reg:" << SystemZInstPrinter::getRegisterName(getReg());
break;
....
}
PVS-Studio warning:
There is a very suspicious operator at the beginning break. Did you forget to write something else here?
Fragment N12: Pointer check after dereference
InlineCost AMDGPUInliner::getInlineCost(CallSite CS) {
Function *Callee = CS.getCalledFunction();
Function *Caller = CS.getCaller();
TargetTransformInfo &TTI = TTIWP->getTTI(*Callee);
if (!Callee || Callee->isDeclaration())
return llvm::InlineCost::getNever("undefined callee");
....
}
PVS-Studio warning:
Index callee at the beginning is dereferenced at the time of the function call getTTI.
And then it turns out that this pointer should be checked for equality nullptr:
if (!Callee || Callee->isDeclaration())
But it's' too late…
Fragment N13 - N…: Pointer check after dereference
The situation discussed in the previous code snippet is not unique. She meets here:
static Value *optimizeDoubleFP(CallInst *CI, IRBuilder<> &B,
bool isBinary, bool isPrecise = false) {
....
Function *CalleeFn = CI->getCalledFunction();
StringRef CalleeNm = CalleeFn->getName(); // <=
AttributeList CalleeAt = CalleeFn->getAttributes();
if (CalleeFn && !CalleeFn->isIntrinsic()) { // <=
....
}
PVS-Studio warning: V595 [CWE-476] The 'CalleeFn' pointer was utilized before it was verified against nullptr. Check lines: 1079, 1081. SimplifyLibCalls.cpp 1079
And here:
void Sema::InstantiateAttrs(const MultiLevelTemplateArgumentList &TemplateArgs,
const Decl *Tmpl, Decl *New,
LateInstantiatedAttrVec *LateAttrs,
LocalInstantiationScope *OuterMostScope) {
....
NamedDecl *ND = dyn_cast<NamedDecl>(New);
CXXRecordDecl *ThisContext =
dyn_cast_or_null<CXXRecordDecl>(ND->getDeclContext()); // <=
CXXThisScopeRAII ThisScope(*this, ThisContext, Qualifiers(),
ND && ND->isCXXInstanceMember()); // <=
....
}
PVS-Studio warning: V595 [CWE-476] The 'ND' pointer was utilized before it was verified against nullptr. Check lines: 532, 534. SemaTemplateInstantiateDecl.cpp 532
And here:
- V595 [CWE-476] The 'U' pointer was utilized before it was verified against nullptr. Check lines: 404, 407. DWARFFormValue.cpp 404
- V595 [CWE-476] The 'ND' pointer was utilized before it was verified against nullptr. Check lines: 2149, 2151. SemaTemplateInstantiate.cpp 2149
And then it became uninteresting for me to study warnings with the number V595. So I don't know if there are other similar errors besides those listed here. Most likely there is.
Fragment N17, N18: Suspicious shift
static inline bool processLogicalImmediate(uint64_t Imm, unsigned RegSize,
uint64_t &Encoding) {
....
unsigned Size = RegSize;
....
uint64_t NImms = ~(Size-1) << 1;
....
}
PVS-Studio warning:
Perhaps this is not a bug, and the code works exactly as intended. But this is clearly a very suspicious place and needs to be checked.
Let's say a variable Size is 16, and then the author of the code planned to get in the variable NImms value:
1111111111111111111111111111111111111111111111111111111111100000
However, in reality, the value will be:
0000000000000000000000000000000011111111111111111111111111100000
The matter is that all calculations occur with use of 32-bit unsigned type. And only then, this 32-bit unsigned type will be implicitly expanded to uint64_t. In this case, the high bits will be zero.
You can fix the situation like this:
uint64_t NImms = ~static_cast<uint64_t>(Size-1) << 1;
Similar situation: V629 [CWE-190] Consider inspecting the 'Immr << 6' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. AArch64AddressingModes.h 269
Fragment N19: Keyword missing else?
void AMDGPUAsmParser::cvtDPP(MCInst &Inst, const OperandVector &Operands) {
....
if (Op.isReg() && Op.Reg.RegNo == AMDGPU::VCC) {
// VOP2b (v_add_u32, v_sub_u32 ...) dpp use "vcc" token.
// Skip it.
continue;
} if (isRegOrImmWithInputMods(Desc, Inst.getNumOperands())) { // <=
Op.addRegWithFPInputModsOperands(Inst, 2);
} else if (Op.isDPPCtrl()) {
Op.addImmOperands(Inst, 1);
} else if (Op.isImm()) {
// Handle optional arguments
OptionalIdx[Op.getImmTy()] = I;
} else {
llvm_unreachable("Invalid operand type");
}
....
}
PVS-Studio warning:
There is no error here. Since the then-block of the first if ends in keep on going, it doesn't matter, there is a keyword else or not. Either way, the code will work the same way. However, missed else makes the code more obscure and dangerous. If in the future keep on going disappear, the code will start to work in a completely different way. I think it's better to add else.
Fragment N20: Four typos of the same type
LLVM_DUMP_METHOD void Symbol::dump(raw_ostream &OS) const {
std::string Result;
if (isUndefined())
Result += "(undef) ";
if (isWeakDefined())
Result += "(weak-def) ";
if (isWeakReferenced())
Result += "(weak-ref) ";
if (isThreadLocalValue())
Result += "(tlv) ";
switch (Kind) {
case SymbolKind::GlobalSymbol:
Result + Name.str(); // <=
break;
case SymbolKind::ObjectiveCClass:
Result + "(ObjC Class) " + Name.str(); // <=
break;
case SymbolKind::ObjectiveCClassEHType:
Result + "(ObjC Class EH) " + Name.str(); // <=
break;
case SymbolKind::ObjectiveCInstanceVariable:
Result + "(ObjC IVar) " + Name.str(); // <=
break;
}
OS << Result;
}
PVS-Studio warnings:
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + Name.str()' expression. Symbol.cpp 32
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC Class) " + Name.str()' expression. Symbol.cpp 35
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC Class EH) " + Name.str()' expression. Symbol.cpp 38
- V655 [CWE-480] The strings were concatenated but are not utilized. Consider inspecting the 'Result + "(ObjC IVar) " + Name.str()' expression. Symbol.cpp 41
By chance, the + operator is used instead of the += operator. The result is constructs that are devoid of meaning.
Fragment N21: Undefined behavior
static void getReqFeatures(std::map<StringRef, int> &FeaturesMap,
const std::vector<Record *> &ReqFeatures) {
for (auto &R : ReqFeatures) {
StringRef AsmCondString = R->getValueAsString("AssemblerCondString");
SmallVector<StringRef, 4> Ops;
SplitString(AsmCondString, Ops, ",");
assert(!Ops.empty() && "AssemblerCondString cannot be empty");
for (auto &Op : Ops) {
assert(!Op.empty() && "Empty operator");
if (FeaturesMap.find(Op) == FeaturesMap.end())
FeaturesMap[Op] = FeaturesMap.size();
}
}
}
Try to find the dangerous code yourself. And this is a picture to distract attention so as not to immediately look at the answer:
PVS-Studio warning:
Problematic line:
FeaturesMap[Op] = FeaturesMap.size();
If element Op not found, then a new element is created in the map and the number of elements in this map is written there. It's just unknown whether the function will be called size before or after adding a new element.
Fragment N22-N24: Repeated Assignments
Error MachOObjectFile::checkSymbolTable() const {
....
} else {
MachO::nlist STE = getSymbolTableEntry(SymDRI);
NType = STE.n_type; // <=
NType = STE.n_type; // <=
NSect = STE.n_sect;
NDesc = STE.n_desc;
NStrx = STE.n_strx;
NValue = STE.n_value;
}
....
}
PVS-Studio warning:
I don't think there is a real error here. Just an extra repeated assignment. But still a bug.
Similar to:
- V519 [CWE-563] The 'B.NDesc' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1488, 1489. llvm-nm.cpp 1489
- V519 [CWE-563] The variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 59, 61. coff2yaml.cpp 61
Fragment N25-N27: More re-assignments
Now consider a slightly different version of reassignment.
bool Vectorizer::vectorizeLoadChain(
ArrayRef<Instruction *> Chain,
SmallPtrSet<Instruction *, 16> *InstructionsProcessed) {
....
unsigned Alignment = getAlignment(L0);
....
unsigned NewAlign = getOrEnforceKnownAlignment(L0->getPointerOperand(),
StackAdjustedAlignment,
DL, L0, nullptr, &DT);
if (NewAlign != 0)
Alignment = NewAlign;
Alignment = NewAlign;
....
}
PVS-Studio warning: V519 [CWE-563] The 'Alignment' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 1158, 1160. LoadStoreVectorizer.cpp 1160
This is a very strange code that apparently contains a logical error. At the beginning, variable Alignment is assigned a value depending on the condition. And then the assignment happens again, but now without any check.
Similar situations can be seen here:
- V519 [CWE-563] The 'Effects' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 152, 165. WebAssemblyRegStackify.cpp 165
- V519 [CWE-563] The 'ExpectNoDerefChunk' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 4970, 4973. SemaType.cpp 4973
Fragment N28: Always true condition
static int readPrefixes(struct InternalInstruction* insn) {
....
uint8_t byte = 0;
uint8_t nextByte;
....
if (byte == 0xf3 && (nextByte == 0x88 || nextByte == 0x89 ||
nextByte == 0xc6 || nextByte == 0xc7)) {
insn->xAcquireRelease = true;
if (nextByte != 0x90) // PAUSE instruction support // <=
break;
}
....
}
PVS-Studio warning:
The check doesn't make sense. Variable nextByte always not equal to value 0x90, which follows from the previous check. This is some kind of logical error.
Fragment N29 - N...: Always true/false conditions
The analyzer gives a lot of warnings that the entire condition (
static DecodeStatus DecodeGPRPairRegisterClass(MCInst &Inst, unsigned RegNo,
uint64_t Address, const void *Decoder) {
DecodeStatus S = MCDisassembler::Success;
if (RegNo > 13)
return MCDisassembler::Fail;
if ((RegNo & 1) || RegNo == 0xe)
S = MCDisassembler::SoftFail;
....
}
PVS-Studio warning:
The constant 0xE is the value 14 in decimal. Examination RegNo == 0xe doesn't make sense, because if RegNo > 13, then the function will complete its execution.
There were many other warnings with ID V547 and V560, but as with
I will give an example why it is boring to study these responses. The analyzer is absolutely right in issuing a warning for the following code. But this is not a mistake.
bool UnwrappedLineParser::parseBracedList(bool ContinueOnSemicolons,
tok::TokenKind ClosingBraceKind) {
bool HasError = false;
....
HasError = true;
if (!ContinueOnSemicolons)
return !HasError;
....
}
PVS-Studio warning: V547 [CWE-570] Expression '!HasError' is always false. UnwrappedLineParser.cpp 1635
Fragment N30: Suspicious return
static bool
isImplicitlyDef(MachineRegisterInfo &MRI, unsigned Reg) {
for (MachineRegisterInfo::def_instr_iterator It = MRI.def_instr_begin(Reg),
E = MRI.def_instr_end(); It != E; ++It) {
return (*It).isImplicitDef();
}
....
}
PVS-Studio warning:
This is either a bug or a specific technique that is designed to explain something to programmers who read the code. This design does not explain anything to me and looks very suspicious. It's better not to write like that :).
Tired? Then it's time to make tea or coffee.
Defects revealed by new diagnostics
I think 30 triggers of the old diagnostics is enough. Now let's see what interesting things can be found with new diagnostics that appeared in the analyzer after
Fragment N31: Unreachable code
Error CtorDtorRunner::run() {
....
if (auto CtorDtorMap =
ES.lookup(JITDylibSearchList({{&JD, true}}), std::move(Names),
NoDependenciesToRegister, true))
{
....
return Error::success();
} else
return CtorDtorMap.takeError();
CtorDtorsByPriority.clear();
return Error::success();
}
PVS-Studio warning:
As you can see, both operator branches if end with an operator call return. Accordingly, the container CtorDtorsByPriority will never be cleared.
Fragment N32: Unreachable code
bool LLParser::ParseSummaryEntry() {
....
switch (Lex.getKind()) {
case lltok::kw_gv:
return ParseGVEntry(SummaryID);
case lltok::kw_module:
return ParseModuleEntry(SummaryID);
case lltok::kw_typeid:
return ParseTypeIdEntry(SummaryID); // <=
break; // <=
default:
return Error(Lex.getLoc(), "unexpected summary kind");
}
Lex.setIgnoreColonInIdentifiers(false); // <=
return false;
}
PVS-Studio warning: V779 [CWE-561] Unreachable code detected. It is possible that an error is present. LLParser.cpp 835
An interesting situation. Let's take a look at this place first:
return ParseTypeIdEntry(SummaryID);
break;
At first glance, it seems that there is no error here. It looks like the operator break is superfluous and can be simply removed. However, not all so simple.
The analyzer issues a warning on the lines:
Lex.setIgnoreColonInIdentifiers(false);
return false;
Indeed, this code is unreachable. All cases in Switch end with operator call return. And now senseless lonely break doesn't look so harmless! Perhaps one of the branches should end with break, not on return?
Fragment N33: Random zeroing of high bits
unsigned getStubAlignment() override {
if (Arch == Triple::systemz)
return 8;
else
return 1;
}
Expected<unsigned>
RuntimeDyldImpl::emitSection(const ObjectFile &Obj,
const SectionRef &Section,
bool IsCode) {
....
uint64_t DataSize = Section.getSize();
....
if (StubBufSize > 0)
DataSize &= ~(getStubAlignment() - 1);
....
}
PVS-Studio warning:
Please note that the function getStubAlignment returns type unsigned. Let's calculate the value of the expression, assuming that the function returns the value 8:
~(getStubAlignment() - 1)
~(8u-1)
0xFFFFFFF8u
Now notice that the variable DataSize has a 64-bit unsigned type. It turns out that when performing the DataSize & 0xFFFFFFF8u operation, all thirty-two high-order bits will be set to zero. Most likely, this is not what the programmer wanted. I suspect that he wanted to calculate: DataSize & 0xFFFFFFFFFFFFFFF8u.
To fix the error, you should write like this:
DataSize &= ~(static_cast<uint64_t>(getStubAlignment()) - 1);
Or so:
DataSize &= ~(getStubAlignment() - 1ULL);
Fragment N34: Unsuccessful Explicit Type Casting
template <typename T>
void scaleShuffleMask(int Scale, ArrayRef<T> Mask,
SmallVectorImpl<T> &ScaledMask) {
assert(0 < Scale && "Unexpected scaling factor");
int NumElts = Mask.size();
ScaledMask.assign(static_cast<size_t>(NumElts * Scale), -1);
....
}
PVS-Studio warning:
Explicit type casting is used to avoid overflow when multiplying type variables int. However, here explicit type casting does not protect against overflow. At the beginning, the variables will be multiplied, and only then the 32-bit result of the multiplication will be expanded to type
Fragment N35: Unsuccessful Copy-Paste
Instruction *InstCombiner::visitFCmpInst(FCmpInst &I) {
....
if (!match(Op0, m_PosZeroFP()) && isKnownNeverNaN(Op0, &TLI)) {
I.setOperand(0, ConstantFP::getNullValue(Op0->getType()));
return &I;
}
if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) {
I.setOperand(1, ConstantFP::getNullValue(Op0->getType())); // <=
return &I;
}
....
}
This new interesting diagnostic reveals situations when a piece of code was copied, and some names were changed in it, but they were not corrected in one place.
Please note that in the second block we changed Op0 on Op1. But in one place they did not fix it. Most likely it should have been written like this:
if (!match(Op1, m_PosZeroFP()) && isKnownNeverNaN(Op1, &TLI)) {
I.setOperand(1, ConstantFP::getNullValue(Op1->getType()));
return &I;
}
Fragment N36: Confusion in variables
struct Status {
unsigned Mask;
unsigned Mode;
Status() : Mask(0), Mode(0){};
Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) {
Mode &= Mask;
};
....
};
PVS-Studio warning:
It is very dangerous to give function arguments the same names as class members. It's very easy to get confused. We have just such a case. This expression doesn't make sense:
Mode &= Mask;
The function argument changes. And that's it. This argument is no longer used. Most likely, it should have been written like this:
Status(unsigned Mask, unsigned Mode) : Mask(Mask), Mode(Mode) {
this->Mode &= Mask;
};
Fragment N37: Confusion in variables
class SectionBase {
....
uint64_t Size = 0;
....
};
class SymbolTableSection : public SectionBase {
....
};
void SymbolTableSection::addSymbol(Twine Name, uint8_t Bind, uint8_t Type,
SectionBase *DefinedIn, uint64_t Value,
uint8_t Visibility, uint16_t Shndx,
uint64_t Size) {
....
Sym.Value = Value;
Sym.Visibility = Visibility;
Sym.Size = Size;
Sym.Index = Symbols.size();
Symbols.emplace_back(llvm::make_unique<Symbol>(Sym));
Size += this->EntrySize;
}
PVS-Studio warning: V1001 [CWE-563] The 'Size' variable is assigned but is not used by the end of the function. Object.cpp 424
The situation is similar to the previous one. Should be written:
this->Size += this->EntrySize;
Fragment N38-N47: Forgot to check the pointer
Previously, we considered examples of triggering diagnostics
int getGEPCost(Type *PointeeType, const Value *Ptr,
ArrayRef<const Value *> Operands) {
....
if (Ptr != nullptr) { // <=
assert(....);
BaseGV = dyn_cast<GlobalValue>(Ptr->stripPointerCasts());
}
bool HasBaseReg = (BaseGV == nullptr);
auto PtrSizeBits = DL.getPointerTypeSizeInBits(Ptr->getType()); // <=
....
}
PVS-Studio warning: V1004 [CWE-476] The 'Ptr' pointer was used unsafely after it was verified against nullptr. Check lines: 729, 738. TargetTransformInfoImpl.h 738
Variable Ptr may be equal nullptr, as evidenced by the check:
if (Ptr != nullptr)
However, below this pointer is dereferenced without a preliminary check:
auto PtrSizeBits = DL.getPointerTypeSizeInBits(Ptr->getType());
Let's consider another similar case.
llvm::DISubprogram *CGDebugInfo::getFunctionFwdDeclOrStub(GlobalDecl GD,
bool Stub) {
....
auto *FD = dyn_cast<FunctionDecl>(GD.getDecl());
SmallVector<QualType, 16> ArgTypes;
if (FD) // <=
for (const ParmVarDecl *Parm : FD->parameters())
ArgTypes.push_back(Parm->getType());
CallingConv CC = FD->getType()->castAs<FunctionType>()->getCallConv(); // <=
....
}
PVS-Studio warning: V1004 [CWE-476] The 'FD' pointer was used unsafely after it was verified against nullptr. Check lines: 3228, 3231. CGDebugInfo.cpp 3231
Pay attention to the pointer FD. I am sure the problem is clearly visible, and no special explanation is required.
And more:
static void computePolynomialFromPointer(Value &Ptr, Polynomial &Result,
Value *&BasePtr,
const DataLayout &DL) {
PointerType *PtrTy = dyn_cast<PointerType>(Ptr.getType());
if (!PtrTy) { // <=
Result = Polynomial();
BasePtr = nullptr;
}
unsigned PointerBits =
DL.getIndexSizeInBits(PtrTy->getPointerAddressSpace()); // <=
....
}
PVS-Studio warning: V1004 [CWE-476] The 'PtrTy' pointer was used unsafely after it was verified against nullptr. Check lines: 960, 965. InterleavedLoadCombinePass.cpp 965
How to protect yourself from such errors? Be more attentive at Code-Review and use the PVS-Studio static analyzer for regular code reviews.
There is no point in citing other code fragments with errors of this type. I will leave only a list of warnings in the article:
- V1004 [CWE-476] The 'Expr' pointer was used unsafely after it was verified against nullptr. Check lines: 1049, 1078. DebugInfoMetadata.cpp 1078
- V1004 [CWE-476] The 'PI' pointer was used unsafely after it was verified against nullptr. Check lines: 733, 753. LegacyPassManager.cpp 753
- V1004 [CWE-476] The 'StatepointCall' pointer was used unsafely after it was verified against nullptr. Check lines: 4371, 4379. Verifier.cpp 4379
- V1004 [CWE-476] The 'RV' pointer was used unsafely after it was verified against nullptr. Check lines: 2263, 2268. TGParser.cpp 2268
- V1004 [CWE-476] The 'CalleeFn' pointer was used unsafely after it was verified against nullptr. Check lines: 1081, 1096. SimplifyLibCalls.cpp 1096
- V1004 [CWE-476] The 'TC' pointer was used unsafely after it was verified against nullptr. Check lines: 1819, 1824. Driver.cpp 1824
Fragment N48-N60: Not critical, but a defect (possible memory leak)
std::unique_ptr<IRMutator> createISelMutator() {
....
std::vector<std::unique_ptr<IRMutationStrategy>> Strategies;
Strategies.emplace_back(
new InjectorIRStrategy(InjectorIRStrategy::getDefaultOps()));
....
}
PVS-Studio warning:
To add an element to the end of a type container std::vector > you can't just write xxx.push_back(new X), since there is no implicit conversion from X* в std::unique_ptr.
A common solution is to write xxx.emplace_back(new X), since it compiles: method emplace_back constructs the element directly from the arguments and can therefore use explicit constructors.
It is not safe. If the vector is full, then memory is re-allocated. The reallocate operation may fail, resulting in an exception being thrown. std::bad_alloc. In this case, the pointer will be lost and the created object will never be deleted.
The safe solution is to create unique_ptr, which will own the pointer before the vector tries to reallocate the memory:
xxx.push_back(std::unique_ptr<X>(new X))
Since C++14, 'std::make_unique' can be used:
xxx.push_back(std::make_unique<X>())
This type of defect is not critical for LLVM. If it fails to allocate memory, then the compiler will simply stop. However, for applications with a long
So, although this code does not pose a practical danger to LLVM, I found it useful to talk about this error pattern and that the PVS-Studio analyzer has learned how to detect it.
Other warnings of this type:
- V1023 [CWE-460] A pointer without owner is added to the 'Passes' container by the 'emplace_back' method. A memory leak will occur in case of an exception. PassManager.h 546
- V1023 [CWE-460] A pointer without owner is added to the 'AAs' container by the 'emplace_back' method. A memory leak will occur in case of an exception. AliasAnalysis.h 324
- V1023 [CWE-460] A pointer without owner is added to the 'Entries' container by the 'emplace_back' method. A memory leak will occur in case of an exception. DWARFDebugFrame.cpp 519
- V1023 [CWE-460] A pointer without owner is added to the 'AllEdges' container by the 'emplace_back' method. A memory leak will occur in case of an exception. CFGMST.h 268
- V1023 [CWE-460] A pointer without owner is added to the 'VMaps' container by the 'emplace_back' method. A memory leak will occur in case of an exception. SimpleLoopUnswitch.cpp 2012
- V1023 [CWE-460] A pointer without owner is added to the 'Records' container by the 'emplace_back' method. A memory leak will occur in case of an exception. FDRLogBuilder.h 30
- V1023 [CWE-460] A pointer without owner is added to the 'PendingSubmodules' container by the 'emplace_back' method. A memory leak will occur in case of an exception. ModuleMap.cpp 810
- V1023 [CWE-460] A pointer without owner is added to the 'Objects' container by the 'emplace_back' method. A memory leak will occur in case of an exception. DebugMap.cpp 88
- V1023 [CWE-460] A pointer without owner is added to the 'Strategies' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-isel-fuzzer.cpp 60
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 685
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 686
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 688
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 689
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 690
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 691
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 692
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 693
- V1023 [CWE-460] A pointer without owner is added to the 'Modifiers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. llvm-stress.cpp 694
- V1023 [CWE-460] A pointer without owner is added to the 'Operands' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 1911
- V1023 [CWE-460] A pointer without owner is added to the 'Stash' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 2100
- V1023 [CWE-460] A pointer without owner is added to the 'Matchers' container by the 'emplace_back' method. A memory leak will occur in case of an exception. GlobalISelEmitter.cpp 2702
Conclusion
In total, I wrote out 60 warnings, after which I stopped. Are there any other defects that the PVS-Studio analyzer detects in LLVM? Yes, I have. However, when I wrote out the code snippets for the article, it was late evening, or rather, even night, and I decided that it was time to wrap up.
I hope you found it interesting and would like to try the PVS-Studio analyzer.
You can download the analyzer and get the trawl key at
Most importantly, use static analysis regularly. One-time checks, which we carry out in order to popularize the methodology of static analysis and PVS-Studio are not a normal scenario.
Good luck in improving the quality and reliability of the code!
If you want to share this article with an English-speaking audience, please use the translation link: Andrey Karpov.
Source: habr.com