
Django & The OWASP Top 10

Jarret Raim
Denim Group 2008
What is Django?
• “Django is a high-level Python Web
framework that encourages rapid
development and clean, pragmatic
design.”
• Developed by in-house programmers
for several news sites.
• Provides advanced functionality
(ORM) and full applications (Admin)
out of the box.
• Django is named after Django
Reinhardt, a gypsy jazz
guitarist from the 1930s to
early 1950s.
• The ‘D’ is silent. It’s
pronounced ‘Jango’.
Overall Design Goals
• Loose Coupling
– The various layers of the framework shouldn’t “know” about each other unless
absolutely necessary.
• Less Code
– Django apps should use as little code as possible; they should lack boilerplate.
– Django should take full advantage of Python’s dynamic capabilities, such as
introspection.
• Quick Development
– The point of a Web framework in the 21st century is to make the tedious aspects of
Web development fast. Django should allow for incredibly quick Web development.
Overall Design Goals
• Don’t Repeat Yourself (DRY)
– Every distinct concept and/or piece of data should live in one, and only one, place.
Redundancy is bad. Normalization is good.
• Explicit Is Better Than Implicit
– Magic is worth using only if it creates a huge convenience unattainable in other
ways, and it isn’t implemented in a way that confuses developers who are trying to
learn how to use the feature.
• Consistency
– The framework should be consistent at all levels. Consistency applies to everything
from low-level (the Python coding style used) to high-level (the “experience” of
using Django).
The Zen Of Python
• Python is an open source
dynamic programming
language.
• Python can interact heavily with
C based plugins for speed.
• Python can be run on the CLR
(IronPython) or on the JVM
(Jython).

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
A Note About Tools
• While perfect IDE integration for
a dynamic language is possible,
the current options are not
great.
• Standard options available
– Eclipse
– Emacs
– Vim
• Several debuggers
– ActiveState
– Winpdb
• Komodo / Open Komodo
– Expensive
Django and MVC
• Django appears to be an MVC framework, but redefines some basic
terms.
– The Controller is the “view”
– The View is the “template”
– Model stays the same
• In Django, the “view” describes the data that gets presented to the
user.
– It’s not necessarily how the data looks, but which data is presented.
– The view describes which data you see, not how you see it.
• Where does the “controller” fit in, then?
– In Django’s case, it’s probably the framework itself: the machinery that sends a
request to the appropriate view, according to the Django URL configuration.
• Django is an “MTV” framework
– “model”, “template”, and “view.”
Request Handling
1. Request is handled by
server (mod_python, etc).
2. URLConf routes request to
View.
3. View renders response.

• Each section can be
extended by middleware
layers.
• Example middleware
– CSRF Protection
– Authentication / Authorization
– Cache
– Transactions
URLConf
urlpatterns = patterns('',
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', views.month_archive),
    (r'^foo/$', views.foobar_view, {'template_name': 'template1.html'}),
    (r'^mydata/birthday/$', views.my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', views.my_view),
    (r'^admin/', include('django.contrib.admin.urls')),
)

• A regex mapping between the URLs of your application and the views
that handle them.
• Can point to custom views or Django supplied views.
• Uses positional or named groups for parameter passing (?P<year>).
• Can override view parameters like template name.
• URLs can include ‘fake’ captured data.
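The mapping described above can be sketched with nothing but the standard `re` module. This is a minimal illustration of the idea, not Django's resolver: `resolve`, `URL_PATTERNS` and the two view functions are hypothetical names, and the third tuple element plays the role of the extra ("fake" captured) arguments.

```python
import re

# Hypothetical view functions standing in for real Django views.
def month_archive(request, year, month):
    return "archive for %s-%s" % (year, month)

def my_view(request, month, day):
    return "birthday on %s %s" % (month, day)

# (pattern, view, extra keyword arguments), loosely mirroring a URLConf entry.
URL_PATTERNS = [
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', month_archive, {}),
    (r'^mydata/birthday/$', my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', my_view, {}),
]

def resolve(path, request=None):
    """Call the first view whose pattern matches; None stands for a 404."""
    for pattern, view, extra in URL_PATTERNS:
        match = re.match(pattern, path)
        if match:
            kwargs = match.groupdict()  # named groups become keyword arguments
            kwargs.update(extra)        # 'fake' captured data from the URLConf
            return view(request, **kwargs)
    return None
```

Note that the view cannot tell whether `month` and `day` came from the URL or from the extra dictionary, which is what lets one view serve both `/mydata/birthday/` and `/mydata/feb/14/`.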
URLConf Design Goals
• Loose Coupling
– URLs in a Django app should not be coupled to the underlying Python code.
– For example, one site may put stories at /stories/, while another may use /news/.
• Infinite Flexibility
– URLs should be as flexible as possible. Any conceivable URL design should be
allowed.
• Encourage Best Practices
– The framework should make it just as easy (or even easier) for a developer to
design pretty URLs than ugly ones.
– No file extensions or Vignette-style commas
• Definitive URLs
– Technically, foo.com/bar and foo.com/bar/ are two different URLs, and search-
engine robots (and some Web traffic-analyzing tools) would treat them as separate
pages. Django should make an effort to “normalize” URLs so that search-engine
robots don’t get confused.
Simple Views
• The job of the view is to build a ‘Context’ containing all data that will
be passed to the template.
• Views can return any data such as JSON objects, PDFs, streamed
data, etc.
def current_datetime(request):
    now = datetime.datetime.now()
    html = "<html><body>It is now %s.</body></html>" % now
    return HttpResponse(html)

def current_datetime(request):
    now = datetime.datetime.now()
    return render_to_response('current_datetime.html', {'current_date': now})
View Design Goals
• Simplicity
– Writing a view should be as simple as writing a Python function. Developers
shouldn’t have to instantiate a class when a function will do.
• Use Request Objects
– Views should have access to a request object — an object that stores metadata
about the current request. The object should be passed directly to a view function,
rather than the view function having to access the request data from a global
variable. This makes it light, clean and easy to test views by passing in “fake”
request objects.
• Loose Coupling
– A view shouldn’t care about which template system the developer uses — or even
whether a template system is used at all.
• Differentiate Between GET & POST
– GET and POST are distinct; developers should explicitly use one or the other. The
framework should make it easy to distinguish between GET and POST data.
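The "Use Request Objects" goal is easiest to see in a test. The sketch below assumes a hypothetical `FakeRequest` class and a hypothetical `search` view; neither is part of Django, they only show why passing the request as an argument makes a view testable without a server.

```python
class FakeRequest:
    """Hypothetical stand-in for a request object; not Django's HttpRequest."""
    def __init__(self, method="GET", GET=None, POST=None):
        self.method = method
        self.GET = GET or {}
        self.POST = POST or {}

def search(request):
    """A view is a plain function of the request it is given."""
    if request.method == "POST":
        # POST data is read explicitly from request.POST ...
        return "saved %s" % request.POST.get("q", "")
    # ... and GET data explicitly from request.GET.
    return "results for %s" % request.GET.get("q", "")
```

Because nothing is read from a global, a test can build whatever request it needs: `search(FakeRequest(GET={"q": "django"}))`.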
Basic Templates
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en">
<head>
<title>Future time</title>
</head>
<body>
<h1>My helpful timestamp site</h1>
<p>
In {{ hour_offset }} hour(s), it will be {{ next_time }}.
</p>
<hr>
<p>Thanks for visiting my site.</p>
</body>
</html>
Template Design Goals
• Separate Logic from Presentation
– We see a template system as a tool that controls presentation and presentation-
related logic — and that’s it. The template system shouldn’t support functionality
that goes beyond this basic goal.
• Discourage Redundancy
– Support template inheritance to support DRY.
• Be Decoupled from HTML
– Template system should generate any data, not just HTML.
• XML should not be used for template languages
– Using an XML engine to parse templates introduces a whole new world of human
error in editing templates — and incurs an unacceptable level of overhead in
template processing.
• Assume Designer Competence
– Django expects template authors are comfortable editing HTML directly.
Template Design Goals, Part Deux
• Treat whitespace obviously
– Any whitespace that’s not in a template tag should be displayed. (No Magic)
• Don’t invent a programming language
– The template system intentionally doesn’t allow the following:
• Assignment to variables
• Advanced logic
– The Django template system recognizes that templates are most often written by
designers, not programmers, and therefore should not assume Python knowledge.
• Safety and security
– The template system, out of the box, should forbid the inclusion of malicious code
such as commands that delete database records.
• Extensibility
– The template system should recognize that advanced template authors may want to
extend its technology.
Template Examples
{{ name|lower }}
{{ my_text|escape|linebreaks }}
{{ bio|truncatewords:"30" }}

{% for country in countries %}
<table>
{% for city in country.city_list %}
<tr>
<td>Country #{{ forloop.parentloop.counter }}</td>
<td>City #{{ forloop.counter }}</td>
<td>{{ city }}</td>
</tr>
{% endfor %}
</table>
{% endfor %}

{% if today_is_weekend %}
<p>Welcome to the weekend!</p>
{% else %} <p>Get back to work.</p> {% endif %}

{% ifequal user currentuser %}
<h1>Welcome!</h1>
{% endifequal %}

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en">
<head>
<title>{% block title %}{% endblock %}</title>
</head>
<body>
<h1>My helpful timestamp site</h1>
{% block content %}{% endblock %}
{% block footer %}
<hr>
<p>Thanks for visiting my site.</p>
{% endblock %}
</body>
</html>

{% extends "base.html" %}
{% block title %}The current time{% endblock %}
{% block content %}
<p>It is now {{ current_date }}.</p>
{% endblock %}
Interacting with the Database: Models
• ORM definitions are simple Python objects.
• Additional field types with additional semantics.
• Generates database agnostic schemas.
• Removes boilerplate (surrogate keys).

class Author(models.Model):
    salutation = models.CharField(maxlength=10)
    first_name = models.CharField(maxlength=30)
    last_name = models.CharField(maxlength=40)
    email = models.EmailField()
    headshot = models.ImageField(upload_to='/tmp')

CREATE TABLE "books_author" (
    "id" serial NOT NULL PRIMARY KEY,
    "salutation" varchar(10) NOT NULL,
    "first_name" varchar(30) NOT NULL,
    "last_name" varchar(40) NOT NULL,
    "email" varchar(75) NOT NULL,
    "headshot" varchar(100) NOT NULL
);
Basic Data Access
>>> from books.models import Publisher
>>> p1 = Publisher(name='X', address='Y',
...     city='Boston', state_province='MA', country='U.S.A.',
...     website='http://www.apress.com/')
>>> p1.save()
>>> publisher_list = Publisher.objects.all()
>>> publisher_list

• Data access is lazy and accomplished as a set of Model methods
(think extension methods in .NET).
• Generates SQL injection safe statements.
Advanced Data Access
>>> Publisher.objects.filter(name="Apress Publishing")
>>> Publisher.objects.filter(country="USA", state_province="CA")
>>> Publisher.objects.filter(name__contains="press")
>>> Publisher.objects.get(name="Penguin")
>>> Publisher.objects.order_by("name")
>>> Publisher.objects.filter(country="U.S.A.").order_by("-name")
>>> Publisher.objects.all()[0]

• Queries are not executed until they are ‘read’ by a list, count or
slice operation, which means queries can be chained.
• Each query object contains an internal cache to minimize database
accesses.
• Extension methods (‘name__contains’) exist for most database
operations like case (in)sensitive matching, >, <, in, startswith, etc.
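The lazy, chainable, cached behaviour described above can be imitated in a few lines. The following is a toy model over an in-memory list, under the assumption that "executing the query" just means filtering rows; `LazyQuery` and `fetch_log` are hypothetical names and this is not how Django's QuerySet is implemented.

```python
class LazyQuery:
    """Toy lazy query over an in-memory list; not Django's QuerySet."""
    def __init__(self, rows, filters=(), fetch_log=None):
        self._rows = rows
        self._filters = tuple(filters)
        self._cache = None
        # Records each time the "database" is actually hit.
        self.fetch_log = fetch_log if fetch_log is not None else []

    def filter(self, **conditions):
        # Chaining returns a new query object; nothing is evaluated yet.
        return LazyQuery(self._rows, self._filters + (conditions,), self.fetch_log)

    def _fetch(self):
        if self._cache is None:  # internal cache: evaluate at most once
            self.fetch_log.append("fetch")
            self._cache = [
                row for row in self._rows
                if all(row.get(k) == v
                       for cond in self._filters
                       for k, v in cond.items())
            ]
        return self._cache

    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())
```

Building `LazyQuery(rows).filter(country="U.S.A.").filter(name="Apress")` touches no data; only iterating it or taking its length triggers the single fetch.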
Model Design Goals
• Explicit is Better than Implicit
– Fields shouldn’t assume certain behaviors based solely on the name of the field.
• Include all Relevant Domain Logic
– Models should encapsulate every aspect of an “object,” following Martin Fowler’s
Active Record design pattern.
• SQL Efficiency
– The framework should execute SQL statements as few times as possible, and it
should optimize statements internally.
• Terse, Powerful Syntax
– The database API should allow rich, expressive statements in as little syntax as
possible. It should not rely on importing other modules or helper objects.
• Raw SQL When Needed
– The database API should realize it’s a shortcut but not necessarily an end-all-be-all.
The framework should make it easy to write custom SQL — entire statements, or
just custom WHERE clauses as custom parameters to API calls.
Administration Site

Django’s focus on removing the boilerplate work of web development led
to the creation of a full-featured, configurable administrative interface
to any Django app.
Form Processing
• Forms can be auto-generated just like models and can even be pulled
directly from a model class.
• Includes built in validation for fields like email and image.
• Rendering includes screen reader hints like <label> tags and can be
rendered in multiple ways to allow for custom CSS usage.

TOPIC_CHOICES = (
    ('general', 'General enquiry'),
    ('bug', 'Bug report'),
    ('suggestion', 'Suggestion'),
)

class ContactForm(forms.Form):
    topic = forms.ChoiceField(choices=TOPIC_CHOICES)
    message = forms.CharField()
    sender = forms.EmailField(required=False)

<h1>Contact us</h1>
<form action="." method="POST">
<table> {{ form.as_table }} </table>
<ul> {{ form.as_ul }} </ul>
<p> {{ form.as_p }} </p>
<input type="submit" value="Submit" />
</form>
Generic Views
publisher_info = { "queryset" : Publisher.objects.all(), }

urlpatterns = patterns('',
    (r'^publishers/$', list_detail.object_list, publisher_info)
)

{% block content %}
<h2>Publishers</h2>
<ul>
{% for publisher in object_list %}
<li>{{ publisher.name }}</li>
{% endfor %}
</ul>
{% endblock %}

Because it’s such a common task, Django comes with a handful of built-
in generic views that make generating list and detail views of objects
incredibly easy.
Middleware
• Session Framework
– The session framework lets you store and retrieve arbitrary data on a
per-site-visitor basis; the data is kept server-side and the cookie carries only a session ID.
• Users & Authentication
– Handles user accounts, groups, permissions, and cookie-based user sessions.
– Access limits can be expressed in code, decorator methods or template language.
• Caching
– Implements several caching mechanisms for development and production.
• CSRF Protection
– Implements the random seeded form method for protecting against CSRF attacks.
• Transaction
– This middleware binds a database COMMIT or ROLLBACK to the request /
response phase. If a view function runs successfully, a COMMIT is issued. If the
view raises an exception, a ROLLBACK is issued.
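The commit-or-rollback behaviour of the transaction middleware can be sketched with the standard `sqlite3` module. This is only an illustration of the pattern: `run_in_transaction`, `add_publisher` and `failing_view` are hypothetical names, and a real middleware wraps the request / response cycle rather than a direct function call.

```python
import sqlite3

def run_in_transaction(conn, view, *args):
    """Commit if the view returns normally, roll back if it raises."""
    try:
        result = view(conn, *args)
    except Exception:
        conn.rollback()  # the view failed: discard its writes
        raise
    conn.commit()        # the view succeeded: make its writes permanent
    return result

def add_publisher(conn, name):
    conn.execute("INSERT INTO publisher (name) VALUES (?)", (name,))
    return name

def failing_view(conn, name):
    conn.execute("INSERT INTO publisher (name) VALUES (?)", (name,))
    raise ValueError("view failed after writing")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (name TEXT)")
conn.commit()
```

Running `add_publisher` through the wrapper leaves one row behind, while the row written by `failing_view` disappears when its exception triggers the rollback.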
OWASP Top 10

1. Cross Site Scripting (XSS)
2. Injection Flaws
3. Malicious File Execution
4. Insecure Direct Object Reference
5. Cross Site Request Forgery
6. Information Leakage & Improper Error Handling
7. Broken Authentication & Session Management
8. Insecure Cryptographic Storage
9. Insecure Communications
10. Failure to Restrict URL Access
URL Mapping as a Security Measure
urlpatterns = patterns('',
    (r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', views.month_archive),
    (r'^foo/$', views.foobar_view, {'template_name': 'template1.html'}),
    (r'^mydata/birthday/$', views.my_view, {'month': 'jan', 'day': '06'}),
    (r'^mydata/(?P<month>\w{3})/(?P<day>\d\d)/$', views.my_view),
    (r'^admin/', include('django.contrib.admin.urls')),
)

• URL mappings are just regexes, so they are your first line of defense.
• \d{4}
– Only allows combinations of numbers to be input, otherwise a 404 is issued.
– Valid: 2008, 3456, 1924, 2345
– Invalid: 23<script>alert(‘hi!’)</script>, -3454, 34, 1, 4h8g, #$#$$#ojif, etc.
– Still have to logically validate, but structural validation is easy.
• Helps prevent: Injection Flaws, XSS, CSRF, File Execution
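The valid and invalid inputs listed above can be checked directly against the `\d{4}` pattern. The sketch below uses a simplified, hypothetical `articles/<year>/` route and a helper named `is_routable`; a `False` result stands for the 404 the framework would return.

```python
import re

# Simplified, hypothetical route with a single four-digit year segment.
YEAR_URL = re.compile(r'^articles/(?P<year>\d{4})/$')

# Same route restricted to ASCII digits 0-9 only.
YEAR_URL_ASCII = re.compile(r'^articles/(?P<year>\d{4})/$', re.ASCII)

def is_routable(path, pattern=YEAR_URL):
    """True if the path would reach a view; False stands for a 404."""
    return pattern.match(path) is not None
```

One reason logical validation is still needed: on Python 3 string patterns, `\d` also matches non-ASCII decimal digits (for example full-width digits) unless `re.ASCII` is set, so a structurally valid year is not necessarily one the view expects.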
Custom Data Typing
• CommaSeparatedIntegerField
• DateField / DateTimeField
• EmailField
• FileField, FilePathField
• ImageField
• IpField
• PhoneNumber
• UrlField
• UsStateField

• Django adds custom fields that
are not provided by SQL.
• These fields provide validation
for model and form data.
• Users can define their own fields
with custom validation.
• unique_for_* constraints
• Validator lists for arbitrary fields.
• Enums are first class

GENDER_CHOICES = ( ('M', 'Male'), ('F', 'Female'), )
Cross Site Scripting (XSS)

{{ name|safe }}

• All variables output by the
template engine are
escaped.
• Django provides a ‘safe’
filter to bypass the
protection.
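What escaping buys you can be shown with the standard library's `html.escape`. The function `render_variable` and its `safe` flag are hypothetical and only mimic the idea of escape-by-default with an explicit opt-out; this is not the template engine's code.

```python
from html import escape

def render_variable(value, safe=False):
    """Escape a value for HTML output unless it is explicitly marked safe."""
    text = str(value)
    # Escaping is the default; 'safe' is an explicit opt-out by the author.
    return text if safe else escape(text, quote=True)
```

With the default, a `<script>` payload is rendered as inert text. Passing `safe=True` hands the responsibility back to the template author, so it should only be used on content that is already trusted.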
Security – SQL Injection

• Django automatically escapes all special SQL parameters, according
to the quoting conventions of the database server you’re using (e.g.,
PostgreSQL or MySQL).
• Exception: The where argument to the extra() method. That parameter
accepts raw SQL by design.
• Exception: Queries done “by hand” using the lower-level database
API.
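The difference between pasted-in SQL and parameterized SQL, which is what matters in the two exceptions above, can be sketched with the standard `sqlite3` module. The table, the helper names `find_unsafe` and `find_safe`, and the attack string are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (name TEXT)")
conn.executemany("INSERT INTO publisher (name) VALUES (?)",
                 [("Apress",), ("Penguin",)])

def find_unsafe(name):
    # Vulnerable: user input is pasted into the SQL text itself.
    sql = "SELECT name FROM publisher WHERE name = '%s'" % name
    return conn.execute(sql).fetchall()

def find_safe(name):
    # Parameterized: the value travels separately from the SQL text.
    return conn.execute("SELECT name FROM publisher WHERE name = ?",
                        (name,)).fetchall()

# Classic injection payload that turns the WHERE clause into a tautology.
attack = "x' OR '1'='1"
```

`find_unsafe(attack)` returns every row in the table, while `find_safe(attack)` looks for a publisher literally named `x' OR '1'='1` and finds none. Hand-written queries should follow the second form.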
Malicious File Execution
• Standard Django practice is to
have a separate server for all
static media files.
• Prevents standard directory
traversal attacks (PHP).
• Provides ImageField for
standard user image
manipulation and validation.
• Custom storage backends can
be developed to have specific
validation behavior or custom
storage.
– Eg. Amazon S3, etc.
Insecure Direct Object Reference
• Django defines a special field called ‘slug’.
– In newspaper editing, a slug is a short name given to an article that is in
production.
– Examples: /blog/my-blog-entry, /home/this-is-a-slug
• Slugs allow for SEO friendly URLs & limit the usage of passing of
IDs as GET parameters.
• Views can declare the user / roles allowed to call.
• Permission to specific objects still needs to be checked by hand.

from django.contrib.auth.decorators import user_passes_test

@user_passes_test(lambda u: u.has_perm('polls.can_vote'))
def my_view(request):
    # ...
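The decorator above only answers "may this user call the view at all?". The per-object check that still has to be done by hand might look like the following sketch; `DOCUMENTS`, `get_document` and the local `PermissionDenied` class are hypothetical and stand in for a model lookup and whatever error the application raises.

```python
class PermissionDenied(Exception):
    """Hypothetical error type for this sketch."""

# Hypothetical in-memory store: document id -> owner and body.
DOCUMENTS = {
    1: {"owner": "alice", "body": "alice's notes"},
    2: {"owner": "bob", "body": "bob's notes"},
}

def get_document(username, doc_id):
    """Return a document body only if the caller owns the document."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != username:
        # Same error for "missing" and "not yours", so ids are not leaked.
        raise PermissionDenied("no access to document %r" % doc_id)
    return doc["body"]
```

Without the ownership test, any logged-in user could read document 2 simply by changing the id in the URL, which is the direct object reference problem.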
Cross Site Request Forgery

• Django provides included
middleware to implement
protection.
• Seeds each form with a
hash of the session ID
plus a secret key.
• Ensures that the same
hash is passed back with
each post.
• Django style guidelines
recommend that all GET
requests be idempotent.
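The "hash of the session ID plus a secret key" idea can be sketched with the standard `hmac` and `hashlib` modules. This is an assumption-laden illustration, not the middleware's code: HMAC-SHA256 is swapped in as the keyed hash, and `SECRET_KEY`, `make_csrf_token` and `check_csrf_token` are hypothetical names.

```python
import hashlib
import hmac

SECRET_KEY = "replace-me"  # hypothetical value; a real key must stay private

def make_csrf_token(session_id):
    """Derive a per-session token from the session ID and the secret key."""
    return hmac.new(SECRET_KEY.encode(), session_id.encode(),
                    hashlib.sha256).hexdigest()

def check_csrf_token(session_id, submitted_token):
    """Accept a POST only if the submitted token matches the expected one."""
    expected = make_csrf_token(session_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, submitted_token)
```

A form rendered for one session carries `make_csrf_token(session_id)` in a hidden field. A forged cross-site POST cannot supply it, because the attacker knows neither the victim's session ID nor the secret key.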
Information Leakage & Improper Error Handling
• Django provides a DEBUG
setting in the settings.py file.
Setting it to False prevents the
framework from outputting
sensitive information in error pages.
• The ExceptionMiddleware
defines how the framework
handles exceptions.
• Custom ExceptionMiddleware
can be created to handle
exceptions.
Session Forging / Hijacking
• Django’s session framework doesn’t
allow sessions to be contained in the
URL.
– Unlike PHP, Java, etc.
• The only cookie that the session
framework uses is a single session ID;
all the session data is stored in the
database.
• Session IDs are stored as hashes
(instead of sequential numbers),
which prevents a brute-force attack.
• A user will always get a new session
ID if she tries a nonexistent one,
which prevents session fixation.
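The last two points, unguessable IDs and replacing unknown IDs, can be sketched with the standard `secrets` module. `SESSIONS` and `get_session_id` are hypothetical names, and a dictionary stands in for the database-backed session store.

```python
import secrets

SESSIONS = {}  # hypothetical server-side store: session id -> session data

def get_session_id(cookie_value=None):
    """Return a valid session id, replacing any id the server did not issue."""
    if cookie_value in SESSIONS:
        return cookie_value
    # Unknown or missing id: never adopt it, always mint a fresh random one.
    new_id = secrets.token_hex(16)  # 128 random bits, not a sequential counter
    SESSIONS[new_id] = {}
    return new_id
```

An attacker who plants a cookie such as `attacker-chosen` in the victim's browser gains nothing, since the server discards the unknown value and issues its own.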
Insecure Cryptographic Storage
New Google library to allow for
easy encryption, Keyczar.

import binascii
from Crypto.Cipher import Blowfish
from django.conf import settings

# Defined on a model that has a social_security_number field.
def _get_ssn(self):
    enc_obj = Blowfish.new(settings.SECRET_KEY)
    return u"%s" % enc_obj.decrypt(binascii.a2b_hex(self.social_security_number)).rstrip()

def _set_ssn(self, ssn_value):
    enc_obj = Blowfish.new(settings.SECRET_KEY)
    # Pad with spaces to a multiple of Blowfish's 8-byte block size.
    repeat = 8 - (len(ssn_value) % 8)
    ssn_value = ssn_value + " " * repeat
    self.social_security_number = binascii.b2a_hex(enc_obj.encrypt(ssn_value))

ssn = property(_get_ssn, _set_ssn)

Simple to create transparent encrypted storage for model fields.
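The padding arithmetic in the setter is worth isolating, because it determines which values survive a round trip. The helpers `pad` and `unpad` below are hypothetical names that reproduce only the space-padding and `rstrip()` steps, with no encryption involved.

```python
BLOCK_SIZE = 8  # Blowfish works on 8-byte blocks

def pad(value):
    """Append spaces so the length is a multiple of the block size."""
    repeat = BLOCK_SIZE - (len(value) % BLOCK_SIZE)
    return value + " " * repeat

def unpad(value):
    """Strip the padding again, as the getter does with rstrip()."""
    return value.rstrip()
```

Two consequences follow from this scheme: a value whose length is already a multiple of 8 still gains a full block of 8 spaces, and a value that legitimately ends in whitespace does not round-trip. Neither matters for a social security number, but both are worth knowing before reusing the pattern for other fields.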


Insecure Communications
• The framework has little to do with insecure communications.
• Django uses the secure cookie protocol specified by Professor Alex X.
Liu of Michigan State University.
• Django authentication can be marked to use a secure cookie to force
SSL.

Failure to Restrict URL Access


• Links not specifically called out in the Urlconf don’t exist.
• Views can be tagged with roles & permissions.
• Static files loaded via forced browsing don’t exist in Django.
Deployment & Scaling
• Standard LAMP stack.
• Choice of mod_python,
mod_wsgi and fast-cgi for
request forwarding.
• Standard Django deployment
requires a separate static
media server.
• Scaling
– Offload database.
– Increase application servers.
– Increase media servers.
– Load balance.
• Memcached
Deployment & Scaling

Over 10 million / day served.


Massive Deployment
~ FIN ~
