ACCOUNT
PASSWORD
SOLUTIONS
SOLUTIONS
LIBRARY
LIBRARY
CONTACT
CONTACT
APPLICATIONS
APPLICATIONS
SOURCE
SOURCE

Nanometix® develops advanced software for network management, data processing and information security.

Our code enables easy management of your entire network infrastructure -- host configurations, cloud resources, web sites, firewalls, support tickets, DNS, e-mail, user credentials, intrusion protection, inventory control and more.

Unlike legacy systems like Plesk® or CPanel®, Nanometix® can securely manage any resource on your network. With Nanometix®, there are no agents to install, no licenses to buy and no security holes to worry about. Nanometix® is developed with pure ANSI-compliant compiled code -- you don't need PHP, PERL, SQL databases or anything else -- everything you need is contained in one compact binary. Nanometix® is thousands of times more efficient and secure than script-based software -- you can feel the performance the first time you use it.

Nanometix® is backed by enterprise-grade support and consulting.

WEB SERVER PERFORMANCE IMPROVEMENT TIPS

Try some of the following:

  • Turn on HTTP keepalives for external objects. Otherwise you add an extra round-trip to do another TCP three-way handshake and slow-start for every HTTP request. If you are worried about hitting global server connection limits, set the keepalive timeout to something short, like 5-10 seconds. Also look into serving your static content from a different webserver than your dynamic content. Having thousands of connections open to a stripped down static file webserver can happen in like 10 megs of RAM total, whereas your main webserver might easily eat 10 megs of RAM per connection.

  • Load fewer external objects. Due to request overhead, one bigger file just loads faster than two smaller ones half its size. Figure out how to globally reference the same one or two javascript files and one or two external stylesheets instead of many; if you have more, try preprocessing them when you publish them. If your UI uses dozens of tiny GIFs all over the place, consider switching to a much cleaner CSS-based design which probably won't need so many images. Or load all of your common UI images in one request using a technique called "CSS sprites".

  • If your users regularly load a dozen or more uncached or uncacheable objects per page, consider evenly spreading those objects over four hostnames. This usually means your users can have 4x as many outstanding connections to you. Without HTTP pipelining, this results in their average request latency dropping to about 1/4 of what it was before.

    When you generate a page, evenly spreading your images over four hostnames is most easily done with a hash function, like MD5. Rather than having all <img> tags load objects from http://static.example.com/, create four hostnames (e.g. static0.example.com, static1.example.com, static2.example.com, static3.example.com) and use two bits from an MD5 of the image path to choose which of the four hosts you reference in the <img> tag. Make sure all pages consistently reference the same hostname for the same image URL, or you'll end up defeating caching.

    Beware that each additional hostname adds the overhead of an extra DNS lookup and an extra TCP three-way handshake. If your users have pipelining enabled or a given page loads fewer than around a dozen objects, they will see no benefit from the increased concurrency and the site may actually load more slowly. The benefits only become apparent on pages with larger numbers of objects. Be sure to measure the difference seen by your users if you implement this.

  • Possibly the best thing you can do to speed up pages for repeat visitors is to allow static images, stylesheets, and javascript to be unconditionally cached by the browser. This won't help the first page load for a new user, but can substantially speed up subsequent ones.

    Set an Expires header on everything you can, with a date days or even months into the future. This tells the browser it is okay to not revalidate on every request, which can add latency of at least one round-trip per object per page load for no reason.

    Instead of relying on the browser to revalidate its cache, if you change an object, change its URL. One simple way to do this for static objects if you have staged pushes is to have the push process create a new directory named by the build number, and teach your site to always reference objects out of the current build's base URL. (Instead of <img src="http://example.com/logo.gif"> you'd use <img src="http://example.com/build/1234/logo.gif">. When you do another build next week, all references change to <img src="http://example.com/build/1235/logo.gif">.) This also nicely solves problems with browsers sometimes caching things longer than they should -- since the URL changed, they think it is a completely different object.

    If you conditionally gzip HTML, javascript, or CSS, you probably want to add a "Cache-Control: private" if you set an Expires header. This will prevent problems with caching by proxies that won't understand that your gzipped content can't be served to everyone. (The Vary header was designed to do this more elegantly, but you can't use it because of IE brokenness.)

    For anything where you always serve the exact same content when given the same URL (e.g. static images), add "Cache-Control: public" to give proxies explicit permission to cache the result and serve it to different users. If a caching proxy local to the user has the content, it is likely to have much less latency than you; why not let it serve your static objects if it has them?

    Avoid the use of query params in image URLs, etc. At least the Squid cache refuses to cache any URL containing a question mark by default. I've heard rumors that other things won't cache those URLs at all, but I don't have more information.

  • On pages where your users are often sent the exact same content over and over, such as your home page or RSS feeds, implementing conditional GETs can substantially improve response time and save server load and bandwidth in cases where the page hasn't changed.

    When serving a static files (including HTML) off of disk, most webservers will generate Last-Modified and/or ETag reply headers for you and make use of the corresponding If-Modified-Since and/or If-None-Match mechanisms on requests. But as soon as you add server-side includes, dynamic templating, or have code generating your content as it is served, you are usually on your own to implement these.

    The idea is pretty simple: When you generate a page, you give the browser a little extra information about exactly what was on the page you sent. When the browser asks for the same page again, it gives you this information back. If it matches what you were going to send, you know that the browser already has a copy and send a much smaller 304 (Not Modified) reply instead of the contents of the page again. And if you are clever about what information you include in an ETag, you can usually skip the most expensive database queries that would've gone into generating the page.

  • Minimize HTTP request size. Often cookies are set domain-wide, which means they are also unnecessarily sent by the browser with every image request from within that domain. What might've been a 400 byte request for an image could easily turn into 1000 bytes or more once you add the cookie headers. If you have a lot of uncached or uncacheable objects per page and big, domain-wide cookies, consider using a separate domain to host static content, and be sure to never set any cookies in it.

  • Minimize HTTP response size by enabling gzip compression for HTML and XML for browsers that support it. For example, the 17k document you are reading takes 90ms of the full downstream bandwidth of a user on 1.5Mbit DSL. Or it will take 37ms when compressed to 6.8k. That's 53ms off of the full page load time for a simple change. If your HTML is bigger and more redundant, you'll see an even greater improvement.

    If you are brave, you could also try to figure out which set of browsers will handle compressed Javascript properly. (Hint: IE4 through IE6 asks for its javascript compressed, then breaks badly if you send it that way.) Or look into Javascript obfuscators that strip out whitespace, comments, etc and usually get it down to 1/3 to 1/2 its original size.

  • Consider locating your small objects (or a mirror or cache of them) closer to your users in terms of network latency. For larger sites with a global reach, either use a commercial Content Delivery Network, or add a colo within 50ms of 8027744022120f your users and use one of the many available methods for routing user requests to your colo nearest them.

  • Regularly use your site from a realistic net connection. Convincing the web developers on my project to use a "slow proxy" that simulates bad DSL in New Zealand (768Kbit down, 128Kbit up, 250ms RTT, 1% packet loss) rather than the gig ethernet a few milliseconds from the servers in the U.S. was a huge win. We found and fixed a number of usability and functional problems very quickly.

    To implement the slow proxy, I used the netem and HTB kernel modules available in the Linux 2.6 kernel, both of which are set up with the tc command line tool. These offer the most accurate simulation I could find, but are definitely not for the faint of heart. I've not used them, but supposedly Tamper Data for Firefox, Fiddler for Windows, and Charles for OSX can all rate-limit and are probably easier to set up, but they may not simulate latency properly.

  • Use Firebug for Firefox from a realistic net connection to see a graphical timeline of what it is doing during a page load. This shows where Firefox has to wait for one HTTP request to complete before starting the next one and how page load time increases with each object loaded. YSlow extends Firebug to offer tips on how to improve your site's performance.

    The Safari team offers a tip on a hidden feature in their browser that offers some timing data too.

    Or if you are familiar with the HTTP protocol and TCP/IP at the packet level, you can watch what is going on using tcpdump, ngrep, or ethereal. These tools are indispensable for all sorts of network debugging.

  • Try benchmarking common pages on your site from a local network with ab, which comes with the Apache webserver. If your server is taking longer than 5 or 10 milliseconds to generate a page, you should make sure you have a good understanding of where it is spending its time.

    If your latencies are high and your webserver process (or CGI if you are using that) is eating a lot of CPU during this test, it is often a result of using a scripting language that needs to recompile your scripts with every request. Software like eAccelerator for PHP, mod_perl for perl, mod_python for python, etc can cache your scripts in a compiled state, dramatically speeding up your site. Beyond that, look at finding a profiler for your language that can tell you where you are spending your CPU. If you improve that, your pages will load faster and you'll be able to handle more traffic with fewer machines.

    If your site relies on doing a lot of database work or some other time-consuming task to generate the page, consider adding server-side caching of the slow operation. Most people start with writing a cache to local memory or local disk, but that starts to fall down if you expand to more than a few web server machines. Look into using memcached, which essentially creates an extremely fast shared cache that's the combined size of the spare RAM you give it off of all of your machines. It has clients available in most common languages.

  • (Optional) Petition browser vendors to turn on HTTP pipelining by default on new browsers. Doing so will remove some of the need for these tricks and make much of the web feel much faster for the average user. (Firefox has this disabled supposedly because some proxies, some load balancers, and some versions of IIS choke on pipelined requests. But Opera has found sufficient workarounds to enable pipelining by default. Why can't other browsers do similarly?)

ETHERNET CABLING

You can find bulk supplies of ethernet cable at many computer stores or most electrical or home centers. You want UTP (Unshielded Twisted Pair) ethernet cable of at least Category 5 (Cat 5). Cat 5 is required for basic 10/100 functionality, you will want Cat 5e for gigabit (1000BaseT) operation and Cat 6 or higher gives you a measure of future proofing. You can also use STP (Shielded Twisted Pair) for extra resistance to external interference but I won't cover shielded connectors. Bulk ethernet cable comes in many types, there are 2 basic categories, solid and braided stranded cable. Stranded ethernet cable tends to work better in patch applications for desktop use. It is more flexible and resilient than solid ethernet cable and easier to work with, but really meant for shorter lengths. Solid ethernet cable is meant for longer runs in a fixed position. Plenum rated ethernet cable must be used whenever the cable travels through an air circulation space. For example, above a false ceiling or below a raised floor. It may be difficult or impossible to tell from the package or labelling what type of ethernet cable it is, so peel out an end and investigate.

Inside the ethernet cable, there are 8 color coded wires. These wires are twisted into 4 pairs of wires, each pair has a common color theme. One wire in the pair being a solid or primarily solid colored wire and the other being a primarily white wire with a colored stripe (Sometimes ethernet cables won't have any color on the striped wire, the only way to tell which is which is to check which wire it is twisted around). Examples of the naming schemes used are: Orange (alternatively Orange/White) for the solid colored wire and White/Orange for the striped cable. The twists are extremely important. They are there to counteract noise and interference. It is important to wire according to a standard to get proper performance from the ethernet cable. The TIA/EIA-568-A specifies two wiring standards for an 8-position modular connector such as RJ45. The two wiring standards, T568A and T568B vary only in the arrangement of the colored pairs. Tom writes to say "...sources suggest using T568A cabling since T568B is the AT&T standard, but the US Government specifies T568A since it matches USOC cabling for pairs 1 & 2, which allows it to work for 1/2 line phones...". Your choice might be determined by the need to match existing wiring, jacks or personal preference, but you should maintain consistency.

The 8P8C modular connectors for Ethernet are often called RJ45 due to their physical resemblance. The plug is an 8-position modular connector that looks like a large phone plug. There are a couple variations available. The primary variation you need to pay attention to is whether the connector is intended for braided or solid wire. For braided/stranded wires, the connector has sharp pointed contacts that actually pierce the wire. For solid wires, the connector has fingers which cut through the insulation and make contact with the wire by grasping it from both sides. The connector is the weak point in an ethernet cable, choosing the wrong one will often cause grief later. If you just walk into a computer store, it's nearly impossible to tell what type of plug it is. You may be able to determine what type it is by crimping one without a cable.

Modular connector jacks come in a variety styles intended for several different mounting options. The choice is one of requirements and preference. Jacks are designed to work only with solid ethernet cable. Most jacks come labeled with color coded wiring diagrams for either T568A, T568B or both. Make sure you end up with the correct one.

There are two basic ethernet cable pin outs. A straight through ethernet cable, which is used to connect to a hub or switch, and a crossover ethernet cable used to operate in a peer-to-peer fashion without a hub/switch. Generally all fixed wiring should be run as straight through. Some ethernet interfaces can cross and un-cross a cable automatically as needed, a handy feature.

If an ethernet cable tester is available, use it to verify the proper connectivity of the cable. That should be it, if your ethernet cable doesn't turn out, look closely at each end and see if you can find the problem. Often a wire ended up in the wrong place or one of the wires is making no contact or poor contact. Also double check the color coding to verify it is correct. If you see a mistake or problem, cut the end off and start again. A ethernet cable tester is invaluable at identifying and highlighting these issues.

When sizing ethernet cables remember that an end to end connection should not extend more than 100m (~328ft). Try to minimize the ethernet cable length, the longer the cable becomes, the more it may affect performance. This is usually noticeable as a gradual decrease in speed and increase in latency.

Power over Ethernet has been implemented in many variations before IEEE standardized 802.3af. 802.3af specifies the ability to supply an endpoint with 48V DC at up 350mA or 16.8W. The endpoint must be capable of receiving power on either the data pairs [Mode A] (often called phantom power) or the unused pairs [Mode B] in 100Base-TX. PoE can be used with any ethernet configuration, including 10Base-T, 100Base-TX and 1000Base-T. Power is only supplied when a valid PoE endpoint is detected by using a low voltage probe to look for the PoE signature on the endpoint. PoE power is typically supplied in one of two ways, either the host ethernet switch provides the power, or a "midspan" device is plugged in between the switch and endpoints and supplies the power. No special cabling is required.

THE DOMAIN NAME SYSTEM

The Domain Name System (DNS) is a hierarchical naming system for computers, services, or any resource participating in the Internet. It associates various information with domain names assigned to such participants. Most importantly, it translates domain names meaningful to humans into the numerical (binary) identifiers associated with networking equipment for the purpose of locating and addressing these devices world-wide. An often used analogy to explain the Domain Name System is that it serves as the "phone book" for the Internet by translating human-friendly computer hostnames into IP addresses. For example, www.example.com translates to 208.77.188.166.

The Domain Name System makes it possible to assign domain names to groups of Internet users in a meaningful way, independent of each user's physical location. Because of this, World-Wide Web (WWW) hyperlinks and Internet contact information can remain consistent and constant even if the current Internet routing arrangements change or the participant uses a mobile device. Internet domain names are easier to remember than IP addresses such as 208.77.188.166 (IPv4) or 2001:db8:1f70::999:de8:7648:6e8 (IPv6). People take advantage of this when they recite meaningful URLs and e-mail addresses without having to know how the machine will actually locate them.

The Domain Name System distributes the responsibility of assigning domain names and mapping those names to IP addresses by designating authoritative name servers for each domain. Authoritative name servers are assigned to be responsible for their particular domains, and in turn can assign other authoritative name servers for their sub-domains. This mechanism has made the DNS distributed, fault tolerant, and helped avoid the need for a single central register to be continually consulted and updated.

In general, the Domain Name System also stores other types of information, such as the list of mail servers that accept email for a given Internet domain. By providing a world-wide, distributed keyword-based redirection service, the Domain Name System is an essential component of the functionality of the Internet.

Other identifiers such as RFID tags, UPC codes, International characters in email addresses and host names, and a variety of other identifiers could all potentially utilize DNS.

The Domain Name System also defines the technical underpinnings of the functionality of this database service. For this purpose it defines the DNS protocol, a detailed specification of the data structures and communication exchanges used in DNS, as part of the Internet Protocol Suite (TCP/IP). The DNS protocol was developed and defined in the early 1980s and published by the Internet Engineering Task Force.

The practice of using a name as a more human-legible abstraction of a machine's numerical address on the network predates even TCP/IP. This practice dates back to the ARPAnet era. Back then, a different system was used. The DNS was invented in 1983, shortly after TCP/IP was deployed. With the older system, each computer on the network retrieved a file called HOSTS.TXT from a computer at SRI (now SRI International). The HOSTS.TXT file mapped numerical addresses to names. A hosts file still exists on most modern operating systems, either by default or through configuration, and allows users to specify an IP address (eg. 208.77.188.166) to use for a hostname (eg. www.example.net) without checking DNS. Systems based on a hosts file have inherent limitations, because of the obvious requirement that every time a given computer's address changed, every computer that seeks to communicate with it would need an update to its hosts file.

The growth of networking required a more scalable system that recorded a change in a host's address in one place only. Other hosts would learn about the change dynamically through a notification system, thus completing a globally accessible network of all hosts' names and their associated IP Addresses.

At the request of Jon Postel, Paul Mockapetris invented the Domain Name System in 1983 and wrote the first implementation. The original specifications appear in RFC 882 and RFC 883. In November 1987, the publication of RFC 1034 and RFC 1035 updated the DNS specification and made RFC 882 and RFC 883 obsolete. Several more-recent RFCs have proposed various extensions to the core DNS protocols.

In 1984, four Berkeley students.Douglas Terry, Mark Painter, David Riggle and Songnian Zhou.wrote the first UNIX implementation, which was maintained by Ralph Campbell thereafter. In 1985, Kevin Dunlap of DEC significantly re-wrote the DNS implementation and renamed it BIND : Berkeley Internet Name Domain. Mike Karels, Phil Almquist and Paul Vixie have maintained BIND since then. BIND was ported to the Windows NT platform in the early 1990s.

BIND was widely distributed, especially on Unix systems, and is the dominant DNS software in use on the Internet. With the heavy use and resulting scrutiny of its open-source code, as well as increasingly more sophisticated attack methods, many security flaws were discovered in BIND. This contributed to the development of a number of alternative nameserver and resolver programs. BIND itself was re-written from scratch in version 9, which has a security record comparable to other modern Internet software.

The domain name space consists of a tree of domain names. Each node or leaf in the tree has zero or more resource records, which hold information associated with the domain name. The tree sub-divides into zones beginning at the root zone. A DNS zone consists of a collection of connected nodes authoritatively served by an authoritative nameserver. (Note that a single nameserver can host several zones.)

Administrative responsibility over any zone may be divided, thereby creating additional zones. Authority is said to be delegated for a portion of the old space, usually in form of sub-domains, to another nameserver and administrative entity. The old zone ceases to be authoritative for the new zone.

A domain name usually consists of two or more parts (technically labels), which are conventionally written separated by dots, such as example.com.

The rightmost label conveys the top-level domain (for example, the address www.example.com has the top-level domain com).

Each label to the left specifies a subdivision, or subdomain of the domain above it. Note: .subdomain. expresses relative dependence, not absolute dependence. For example: example.com is a subdomain of the com domain, and www.example.com is a subdomain of the domain example.com. In theory, this subdivision can go down 127 levels. Each label can contain up to 63 octets. The whole domain name may not exceed a total length of 253 octets. In practice, some domain registries may have shorter limits.

A hostname refers to a domain name that has one or more associated IP addresses; ie: the 'www.example.com' and 'example.com' domains are both hostnames, however, the 'com' domain is not.

The Domain Name System is maintained by a distributed database system, which uses the client-server model. The nodes of this database are the name servers. Each domain or subdomain has one or more authoritative DNS servers that publish information about that domain and the name servers of any domains subordinate to it. The top of the hierarchy is served by the root nameservers: the servers to query when looking up (resolving) a top-level domain name (TLD).

The client-side of the DNS is called a DNS resolver. It is responsible for initiating and sequencing the queries that ultimately lead to a full resolution (translation) of the resource sought, e.g., translation of a domain name into an IP address.

A DNS query may be either a recursive query or a non-recursive query.

A non-recursive query is one in which the DNS server may provide a partial answer to the query (or give an error). A recursive query is one where the DNS server will fully answer the query (or give an error). DNS servers are not required to support recursive queries.

The resolver (or another DNS server acting recursively on behalf of the resolver) negotiates use of recursive service using bits in the query headers.

Resolving usually entails iterating through several name servers to find the needed information. However, some resolvers function simplistically and can communicate only with a single name server. These simple resolvers rely on a recursive query to a recursive name server to perform the work of finding information for them.

In theory a full host name may have several name segments, (e.g ahost.ofasubnet.ofabiggernet.inadomain.example). In practice, full host names will frequently consist of just three segments (ahost.inadomain.example, and most often www.inadomain.example). For querying purposes, software interprets the name segment by segment, from right to left. At each step along the way, the program queries a corresponding DNS server to provide a pointer to the next server which it should consult. A DNS recursor consults three nameservers to resolve the address www.wikipedia.org.

The mechanism in this simple form has a difficulty: it places a huge operating burden on the root servers, with every search for an address starting by querying one of them. Being as critical as they are to the overall function of the system, such heavy use would create an insurmountable bottleneck for trillions of queries placed every day. In practice caching is used to overcome this problem, and in actual fact root nameservers deal with very little of the total traffic.

Name servers in delegations appear listed by name, rather than by IP address. This means that a resolving name server must issue another DNS request to find out the IP address of the server to which it has been referred. Since this can introduce a circular dependency if the nameserver referred to is under the domain that it is authoritative of, it is occasionally necessary for the nameserver providing the delegation to also provide the IP address of the next nameserver. This record is called a glue record.

DNS also supports wildcard DNS records that will match requests for non-existent domain names. A wildcard DNS record is specified by using a "*" as the left most label (part) of a domain name, e.g. *.example.com. The exact rules for when a wild card will match are specified in RFC 1034, but the rules are neither intuitive nor clearly specified. This has resulted in incompatible implementations and unexpected results when they are used.

Because of the huge volume of requests generated by a system like DNS, the designers wished to provide a mechanism to reduce the load on individual DNS servers. To this end, the DNS resolution process allows for caching (i.e. the local recording and subsequent consultation of the results of a DNS query) for a given period of time after a successful answer. How long a resolver caches a DNS response (i.e. how long a DNS response remains valid) is determined by a value called the time to live (TTL). The TTL is set by the administrator of the DNS server handing out the response. The period of validity may vary from just seconds to days or even weeks.

As a noteworthy consequence of this distributed and caching architecture, changes to DNS do not always take effect immediately and globally. This is best explained with an example: If an administrator has set a TTL of 6 hours for the host www.wikipedia.org, and then changes the IP address to which www.wikipedia.org resolves at 12:01pm, the administrator must consider that a person who cached a response with the old IP address at 12:00 noon will not consult the DNS server again until 6:00pm. The period between 12:01pm and 6:00pm in this example is called caching time, which is best defined as a period of time that begins when you make a change to a DNS record and ends after the maximum amount of time specified by the TTL expires. This essentially leads to an important logistical consideration when making changes to DNS: not everyone is necessarily seeing the same thing you're seeing. RFC 1912 helps to convey basic rules for how to set the TTL.