-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Java: allow the definition of java/unsafe-deserialization
sinks using data extensions
#20067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
fdd1e3f
ad60aff
7d4a70c
805e31f
b361f76
6629bd8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
--- | ||
category: feature | ||
--- | ||
* You can now add sinks for the query "Deserialization of user-controlled data" (`java/unsafe-deserialization`) using [data extensions](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/#extensible-predicates-used-to-create-custom-models-in-java-and-kotlin) by extending `sinkModel` and using the kind "unsafe-deserialization". The existing sinks which do not require extra logic to determine if they are unsafe are now defined in this way. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
--- | ||
category: deprecated | ||
--- | ||
* The module `semmle.code.java.frameworks.Castor` has been deprecated and will be removed in a future release. | ||
* The module `semmle.code.java.frameworks.JYaml` has been deprecated and will be removed in a future release. | ||
* The classes `UnsafeHessianInputReadObjectMethod` and `BurlapInputReadObjectMethod` in the module `semmle.code.java.frameworks.HessianBurlap` have been deprecated and will be removed in a future release. | ||
* The class `YamlBeansReaderReadMethod` in the module `semmle.code.java.frameworks.YamlBeans` has been deprecated and will be removed in a future release. | ||
* The class `MethodApacheSerializationUtilsDeserialize` in the module `semmle.code.java.frameworks.apache.Lang` has been deprecated and will be removed in a future release. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["com.alibaba.com.caucho.hessian.io", "AbstractHessianInput", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] | ||
- ["com.alibaba.com.caucho.hessian.io", "Hessian2StreamingInput", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["com.caucho.burlap.io", "BurlapInput", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["com.caucho.hessian.io", "AbstractHessianInput", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] | ||
- ["com.caucho.hessian.io", "Hessian2StreamingInput", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["com.cedarsoftware.util.io", "JsonReader", False, "jsonToJava", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["com.cedarsoftware.util.io", "JsonReader", True, "readObject", "", "", "Argument[this]", "unsafe-deserialization", "manual"] | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["com.esotericsoftware.yamlbeans", "YamlReader", True, "read", "", "", "Argument[this]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,3 +13,8 @@ extensions: | |
- ["java.beans", "PropertyEditor", "getValue", "()", "summary", "df-manual"] # needs to be modeled by regular CodeQL matching the get and set keys to reduce FPs | ||
- ["java.beans", "PropertyEditor", "setAsText", "()", "summary", "df-manual"] # needs to be modeled by regular CodeQL matching the get and set keys to reduce FPs | ||
- ["java.beans", "PropertyEditor", "setValue", "()", "summary", "df-manual"] # needs to be modeled by regular CodeQL matching the get and set keys to reduce FPs | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["java.beans", "XMLDecoder", True, "readObject", "()", "", "Argument[this]", "unsafe-deserialization", "manual"] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you explain the logic behind when the models are also applied to overrides (some models are set to "True" and others are set to "False")? It looks like the QL code doesn't apply to overrides. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I tried to go for "True" except for static methods, where I put "False" as it doesn't make any sense to override a static method. (I prefer to put "True" for those for consistency, but when doing the equivalent for Go models I was outvoted.) I may have made mistakes though. In general I think we agreed that there isn't much point to having models with "False" in that column, and you even had a PR open to make that column meaningless for C#. Hence why I defaulted to "True" without really thinking about it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. I was just curious about the reasoning. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["org.exolab.castor.xml", "Unmarshaller", True, "unmarshal", "", "", "Argument[0..1]", "unsafe-deserialization", "manual"] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does there exist multiple overloads of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, many overloads, some with only 1 argument and some with 2. I wasn't aware that there might be a problem - I think QL will just say "well, there isn't one of those" and move on. I feel like I have done this before, when there are overloads with different numbers of arguments, though I don't remember specifics. I ran MRVA on 1000 repos and didn't observe any problems. And, now that I look more carefully, there is a test with a call to a one-argument overload of unmarshal, and it still alerts fine. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, then all is good! Thank you for checking. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["org.ho.yaml", "Yaml", False, "load", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "Yaml", False, "loadStream", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "Yaml", False, "loadStreamOfType", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "Yaml", False, "loadType", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "YamlConfig", False, "load", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "YamlConfig", False, "loadStream", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "YamlConfig", False, "loadStreamOfType", "", "", "Argument[0]", "unsafe-deserialization", "manual"] | ||
- ["org.ho.yaml", "YamlConfig", False, "loadType", "", "", "Argument[0]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
extensions: | ||
- addsTo: | ||
pack: codeql/java-all | ||
extensible: sinkModel | ||
data: | ||
- ["org.jabsorb", "JSONSerializer", True, "fromJSON", "", "", "Argument[0]", "unsafe-deserialization", "manual"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,17 +3,16 @@ | |
*/ | ||
|
||
import semmle.code.java.dataflow.FlowSources | ||
private import semmle.code.java.dataflow.ExternalFlow | ||
Check warningCode scanning / CodeQL Redundant import Warning
Redundant import, the module is already imported inside
semmle.code.java.dataflow.FlowSinks Error loading related location Loading |
||
private import semmle.code.java.dataflow.FlowSinks | ||
private import semmle.code.java.dispatch.VirtualDispatch | ||
private import semmle.code.java.frameworks.Kryo | ||
private import semmle.code.java.frameworks.XStream | ||
private import semmle.code.java.frameworks.SnakeYaml | ||
private import semmle.code.java.frameworks.FastJson | ||
private import semmle.code.java.frameworks.JYaml | ||
private import semmle.code.java.frameworks.JsonIo | ||
private import semmle.code.java.frameworks.YamlBeans | ||
private import semmle.code.java.frameworks.HessianBurlap | ||
private import semmle.code.java.frameworks.Castor | ||
private import semmle.code.java.frameworks.Jackson | ||
private import semmle.code.java.frameworks.Jabsorb | ||
private import semmle.code.java.frameworks.Jms | ||
|
@@ -51,13 +50,6 @@ | |
} | ||
} | ||
|
||
private class XmlDecoderReadObjectMethod extends Method { | ||
XmlDecoderReadObjectMethod() { | ||
this.getDeclaringType().hasQualifiedName("java.beans", "XMLDecoder") and | ||
this.hasName("readObject") | ||
} | ||
} | ||
|
||
private module SafeXStreamConfig implements DataFlow::ConfigSig { | ||
predicate isSource(DataFlow::Node src) { | ||
any(XStreamEnableWhiteListing ma).getQualifier().(VarAccess).getVariable().getAnAccess() = | ||
|
@@ -149,8 +141,15 @@ | |
|
||
private module SafeKryoFlow = DataFlow::Global<SafeKryoConfig>; | ||
|
||
private class DefaultUnsafeDeserializationSink extends DataFlow::Node { | ||
DefaultUnsafeDeserializationSink() { sinkNode(this, "unsafe-deserialization") } | ||
} | ||
|
||
/** | ||
* Holds if `ma` is a call that deserializes data from `sink`. | ||
* | ||
* Note that this does not include deserialization methods that have been | ||
* specified using models-as-data. | ||
*/ | ||
predicate unsafeDeserialization(MethodCall ma, Expr sink) { | ||
exists(Method m | m = ma.getMethod() | | ||
|
@@ -162,9 +161,6 @@ | |
sink = ma.getQualifier() and | ||
not DataFlow::exprNode(sink).getTypeBound() instanceof SafeObjectInputStreamType | ||
or | ||
m instanceof XmlDecoderReadObjectMethod and | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The class |
||
sink = ma.getQualifier() | ||
or | ||
m instanceof XStreamReadObjectMethod and | ||
sink = ma.getAnArgument() and | ||
not SafeXStreamFlow::flowToExpr(ma.getQualifier()) | ||
|
@@ -173,33 +169,13 @@ | |
sink = ma.getAnArgument() and | ||
not SafeKryoFlow::flowToExpr(ma.getQualifier()) | ||
or | ||
m instanceof MethodApacheSerializationUtilsDeserialize and | ||
sink = ma.getArgument(0) | ||
or | ||
ma instanceof UnsafeSnakeYamlParse and | ||
sink = ma.getArgument(0) | ||
or | ||
ma.getMethod() instanceof FastJsonParseMethod and | ||
not fastJsonLooksSafe() and | ||
sink = ma.getArgument(0) | ||
or | ||
ma.getMethod() instanceof JYamlLoaderUnsafeLoadMethod and | ||
sink = ma.getArgument(0) | ||
or | ||
ma.getMethod() instanceof JsonIoJsonToJavaMethod and | ||
sink = ma.getArgument(0) | ||
or | ||
ma.getMethod() instanceof JsonIoReadObjectMethod and | ||
sink = ma.getQualifier() | ||
or | ||
ma.getMethod() instanceof YamlBeansReaderReadMethod and sink = ma.getQualifier() | ||
or | ||
ma.getMethod() instanceof UnsafeHessianInputReadObjectMethod and sink = ma.getQualifier() | ||
or | ||
ma.getMethod() instanceof CastorUnmarshalMethod and sink = ma.getAnArgument() | ||
or | ||
ma.getMethod() instanceof BurlapInputReadObjectMethod and sink = ma.getQualifier() | ||
or | ||
ma.getMethod() instanceof ObjectMapperReadMethod and | ||
sink = ma.getArgument(0) and | ||
( | ||
|
@@ -215,9 +191,6 @@ | |
sink = ma.getArgument(2) and | ||
UnsafeTypeFlow::flowToExpr(ma.getArgument(1)) | ||
or | ||
m instanceof JabsorbFromJsonMethod and | ||
sink = ma.getArgument(0) | ||
or | ||
m instanceof JoddJsonParseMethod and | ||
sink = ma.getArgument(0) and | ||
( | ||
|
@@ -244,10 +217,17 @@ | |
|
||
/** A sink for unsafe deserialization. */ | ||
class UnsafeDeserializationSink extends ApiSinkNode, DataFlow::ExprNode { | ||
UnsafeDeserializationSink() { unsafeDeserialization(_, this.getExpr()) } | ||
MethodCall mc; | ||
|
||
UnsafeDeserializationSink() { | ||
unsafeDeserialization(mc, this.getExpr()) | ||
or | ||
this instanceof DefaultUnsafeDeserializationSink and | ||
this.getExpr() = [mc.getQualifier(), mc.getAnArgument()] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So this is because all the default sinks are are either There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. If you can think of other things that should be added then let me know. It is possible that a customer in future will try to use some other feature of MaD ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this future proof enough 😄 . |
||
} | ||
|
||
/** Gets a call that triggers unsafe deserialization. */ | ||
MethodCall getMethodCall() { unsafeDeserialization(result, this.getExpr()) } | ||
MethodCall getMethodCall() { result = mc } | ||
} | ||
|
||
/** Holds if `node` is a sanitizer for unsafe deserialization */ | ||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to use "" as the selector for the signature of
jsonToJava
to capture all overloads, but forreadObject
there are only a single method with signature()
- is this intentional (same question for other MaD sinks)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familar with what to put in this column because I'm most familiar with Go, which doesn't allow overloads, so this column is always left blank. My thinking was that if there is only one overload then it is fine to use "" as it will match the one that we want. Is it better to specify the signature when there is only one overload? I note that the QL model doesn't specify anything except the name and declaring type, so it would match any new overloads that were added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For C#, I think we try to only use "" if there are many overloads and where it is unlikely that another method will be introduced that behaves differently. However, I agree with you that the models as they are now more closely mirrors how the QL was defined - it makes sense to just keep it as it is.