
Java URL Class

Parse and represent URLs in Java with the URL class — protocol, host, port, path, query.


A URL (Uniform Resource Locator) names a resource and how to reach it: https://www.example.com:443/docs?q=net#top. Java models it with java.net.URL, which both parses a URL into its components and can open a connection to fetch it. This chapter is about parsing and structure; opening connections is the next chapter.

The anatomy of a URL

  https://user:pw@www.example.com:8443/docs/java?topic=net#intro
  └─┬─┘   └──┬──┘ └──────┬──────┘ └┬─┘└───┬────┘ └───┬───┘ └─┬─┘
 protocol userInfo      host      port   path      query    ref

java.net.URL exposes each piece through a getter: getProtocol(), getUserInfo(), getHost(), getPort(), getPath(), getQuery(), and getRef() (the fragment). getPort() returns -1 when no port is written; getDefaultPort() gives the protocol's default (80 for http, 443 for https).
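As a minimal sketch of those getters in use, the class below collects every component of a URL into a map. The class name UrlParts, the method parts, and the sample addresses are illustrative assumptions, not part of the original chapter.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

class UrlParts {
    // Splits a URL string into named components using the java.net.URL getters.
    // Components that are absent come back as the text "null".
    static Map<String, String> parts(String text) {
        URL url;
        try {
            url = URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + text, e);
        }
        Map<String, String> out = new LinkedHashMap<>();
        out.put("protocol", url.getProtocol());
        out.put("userInfo", String.valueOf(url.getUserInfo()));
        out.put("host", url.getHost());
        out.put("port", String.valueOf(url.getPort()));            // -1 when no port is written
        out.put("defaultPort", String.valueOf(url.getDefaultPort()));
        out.put("path", url.getPath());
        out.put("query", String.valueOf(url.getQuery()));
        out.put("ref", String.valueOf(url.getRef()));               // the fragment
        return out;
    }

    public static void main(String[] args) {
        // Illustrative URL with every component present.
        parts("https://user:pw@www.example.com:8443/docs/java?topic=net#intro")
            .forEach((name, value) -> System.out.println(name + " = " + value));

        // No explicit port: "port" is -1 while "defaultPort" is 80 for http.
        System.out.println(parts("http://www.example.com/index.html"));
    }
}
```

Nothing here touches the network; the getters only read fields that were filled in when the string was parsed.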

Building a URL

URL url  = new URL("https://example.com/path?x=1");        // from a string
URL base = new URL("https://example.com/a/b/page.html");   // the page that contains the link
URL rel  = new URL(base, "../images/logo.png");            // relative to a base

The two-arg constructor resolves a relative reference against a base URL — the same rule a browser uses for a link on a page. It is how you turn ../images/logo.png on a page at /a/b/page.html into the absolute /a/images/logo.png.
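The resolution rule can be checked with a small sketch. The class RelativeLinks and its two helpers are hypothetical names; the first helper uses the two-argument constructor described here, and the second uses URI.resolve, which applies the same relative-reference rule for this input. The deprecation warning is suppressed because newer JDKs flag the constructors.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class RelativeLinks {
    // Resolves a relative reference with the two-argument URL constructor.
    @SuppressWarnings("deprecation") // the URL constructors are flagged on newer JDKs
    static String viaUrl(String base, String relative) {
        try {
            return new URL(new URL(base), relative).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Resolves the same reference with URI.resolve.
    static String viaUri(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String page = "https://example.com/a/b/page.html"; // illustrative base page
        System.out.println(viaUrl(page, "../images/logo.png")); // climbs out of /a/b/
        System.out.println(viaUri(page, "../images/logo.png"));
        System.out.println(viaUri(page, "logo.png"));           // stays in /a/b/
        System.out.println(viaUri(page, "/top.html"));          // replaces the whole path
    }
}
```

The last path segment of the base (page.html) is dropped before the relative reference is appended, which is why ".." lands in /a/ rather than /a/b/.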

Prefer URI for parsing and validation

Since JDK 20, every public URL constructor is deprecated, including new URL(String) and the two-argument form. URL is now meant for opening connections, not for parsing or validating. For pure parsing, use java.net.URI and convert only when you actually need to connect:

URI uri = URI.create("https://example.com/path?x=1");   // parse / validate
URL url = uri.toURL();                                   // only to open a connection
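For validation, one possible approach is to let the URI constructor reject malformed text and then check the parsed components. The sketch below assumes a hypothetical policy ("accept only absolute http or https addresses that name a host"); the class UriCheck and both method names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriCheck {
    // Returns the parsed URI, or null when the text is not syntactically valid.
    static URI parseOrNull(String text) {
        try {
            return new URI(text); // unlike URI.create, reports bad syntax as a checked exception
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // Hypothetical policy: an absolute http or https URI that names a host.
    static boolean isHttpUrl(String text) {
        URI uri = parseOrNull(text);
        if (uri == null || uri.getHost() == null) {
            return false;
        }
        String scheme = uri.getScheme();
        return "http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        System.out.println(isHttpUrl("https://example.com/path?x=1")); // well-formed https
        System.out.println(isHttpUrl("https://example.com/a b"));      // raw space in the path
        System.out.println(isHttpUrl("/relative/path"));               // no scheme, no host
    }
}
```

URI.create wraps the same parser but throws an unchecked IllegalArgumentException, so it suits constants you know are valid, while new URI(String) suits untrusted input.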

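There is also a notorious trap to know about: URL.equals() and URL.hashCode() can perform DNS resolution to compare hosts. That makes them slow and network-dependent, and two different host names that resolve to the same IP address can compare as equal. Never put URL objects in a HashSet or use them as map keys. Use URI instead: its equals() and hashCode() work on the parsed components alone (scheme and host are compared case-insensitively) and never touch the network.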
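A duplicate tracker built on URI might look like the sketch below. The class SeenResources and its methods are hypothetical, and calling normalize() before storing is an extra design choice, not something the text above requires.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

class SeenResources {
    // URI keys hash and compare without any DNS lookup.
    private final Set<URI> seen = new LinkedHashSet<>();

    // Returns true the first time an address is offered, false for repeats.
    boolean markSeen(String address) {
        // normalize() removes "." and ".." path segments before comparison.
        return seen.add(URI.create(address).normalize());
    }

    int size() {
        return seen.size();
    }

    public static void main(String[] args) {
        SeenResources tracker = new SeenResources();
        System.out.println(tracker.markSeen("https://example.com/docs/a.png"));   // first sighting
        System.out.println(tracker.markSeen("https://example.com/docs/a.png"));   // exact repeat
        System.out.println(tracker.markSeen("HTTPS://Example.COM/docs/a.png"));   // scheme/host case differs
        System.out.println(tracker.markSeen("https://example.com/docs/./a.png")); // same after normalize()
        System.out.println(tracker.size());
    }
}
```

Path case and trailing slashes still count as differences, so /Docs/a.png and /docs/a.png remain separate entries.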

A worked example: dissecting and resolving URLs

This program parses a fully-loaded URL into its parts, resolves a relative reference, and contrasts URL with the parsing-friendly URI — all offline, since parsing touches no network.


What to take from the run:

  • Each getter pulled one labelled slice out of the single URL string: protocol, user-info, host, port, path, query, and the #intro fragment via getRef(). A URL is a structured value, not just text — parsing it once and reading fields beats hand-rolling String.split on / and ?.
  • The no-port URL reported getPort() == -1 while getDefaultPort() returned 80. The distinction matters when you build a connection: -1 means "the port was omitted," and you fall back to the protocol default rather than treating -1 as a real port.
  • base.resolve("../images/logo.png") produced an absolute URL by climbing out of /a/b/ — the exact relative-link resolution a browser performs. Relative references are resolved against a base, so the base's path determines where .. lands.
  • The example built the URL via URI.create(...).toURL() rather than the deprecated new URL(String). On modern JDKs that is the recommended path: parse and validate with URI, convert to URL only at the moment you open a connection.
  • URI.equals compared two identical URIs textually and returned true with no network involved. This is the safe way to compare and to use as map keys — URL.equals could have triggered a DNS lookup to resolve the host, which is slow and can fail offline.
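The port fallback from the second point can be sketched as a small helper. The class Ports and the method effectivePort are illustrative names, not part of java.net.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class Ports {
    // Returns the explicit port when one is written, otherwise the protocol default.
    static int effectivePort(String text) {
        URL url;
        try {
            url = URI.create(text).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + text, e);
        }
        int port = url.getPort();                        // -1 means "omitted"
        return port != -1 ? port : url.getDefaultPort(); // fall back to 80, 443, ...
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://example.com/"));       // 80
        System.out.println(effectivePort("https://example.com/"));      // 443
        System.out.println(effectivePort("https://example.com:8443/")); // 8443
    }
}
```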

Practice


A teammate stores downloaded resources in a 'HashSet<URL>' to skip duplicates, and notices the program is slow and sometimes hangs when offline. What is the correct explanation and fix?