Java URL Class
Parse and represent URLs in Java with the URL class — protocol, host, port, path, query.
A URL (Uniform Resource Locator) names a resource and how to reach it: https://www.example.com:443/docs?q=net#top. Java models it with java.net.URL, which both parses a URL into its components and can open a connection to fetch it. This chapter is about parsing and structure; opening connections is the next chapter.
The anatomy of a URL
https://user:[email protected]:8443/docs/java?topic=net#intro
└─┬─┘ └──┬──┘ └──────┬───────┘└┬─┘└───┬────┘ └────┬────┘└─┬─┘
protocol userInfo host port path query ref
java.net.URL exposes each piece through a getter: getProtocol(), getUserInfo(), getHost(), getPort(), getPath(), getQuery(), and getRef() (the fragment). getPort() returns -1 when no port is written; getDefaultPort() gives the protocol's default (80 for http, 443 for https).
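The -1 convention is easiest to see on a URL that leaves most components out. The following is a minimal sketch, not a definitive implementation: the class name UrlParts and the helpers parse and effectivePort are hypothetical, and the www.example.com address is only an illustration. It builds the URL through URI.toURL() because newer JDKs deprecate the URL constructors.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class UrlParts {
    // Hypothetical helper: parse a string into a URL, turning the checked
    // exception into an unchecked one to keep the example short.
    static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    // Hypothetical helper: the port a connection would use, meaning the
    // written port if there is one, otherwise the protocol's default.
    static int effectivePort(URL url) {
        int port = url.getPort();                 // -1 means no port was written
        return port != -1 ? port : url.getDefaultPort();
    }

    public static void main(String[] args) {
        URL bare = parse("http://www.example.com");   // assumed example address
        System.out.println(bare.getPort());            // -1
        System.out.println(effectivePort(bare));       // 80
        System.out.println(bare.getPath().isEmpty());  // true: a missing path is "", not null
        System.out.println(bare.getQuery());           // null
        System.out.println(bare.getRef());             // null
        System.out.println(bare.getUserInfo());        // null
    }
}
```

Note that the optional string components (user info, query, fragment) come back as null when absent, so callers should check before dereferencing them.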
Building a URL
URL url = new URL("https://example.com/path?x=1"); // from a string
URL base = new URL("https://example.com/a/b/page.html"); // the page the link appears on (example value)
URL rel = new URL(base, "../images/logo.png"); // relative to a base
The two-argument constructor resolves a relative reference against a base URL, the same rule a browser uses for a link on a page. It is how you turn ../images/logo.png on a page at /a/b/page.html into the absolute /a/images/logo.png.
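To see how the base path steers the result, here is a small sketch that resolves three kinds of reference with the two-argument constructor. The class name RelativeLinks, the resolve helper, and the example.com page are assumptions made for illustration. The constructor is deprecated on JDK 20 and later, so the sketch suppresses that warning.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class RelativeLinks {
    // Hypothetical helper: resolves a reference against a base URL with the
    // two-argument constructor and returns the absolute form as text.
    @SuppressWarnings("deprecation")   // URL constructors are deprecated on JDK 20+
    static String resolve(String base, String reference) {
        try {
            return new URL(new URL(base), reference).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String page = "https://example.com/a/b/page.html";   // assumed base page

        // ".." climbs from /a/b/ to /a/ before appending the rest.
        System.out.println(resolve(page, "../images/logo.png")); // https://example.com/a/images/logo.png

        // A plain name replaces only the last path segment.
        System.out.println(resolve(page, "other.html"));         // https://example.com/a/b/other.html

        // A leading slash replaces the whole path.
        System.out.println(resolve(page, "/root.css"));          // https://example.com/root.css
    }
}
```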
Prefer URI for parsing and validation
Since JDK 20, the public URL constructors are deprecated, including the string forms shown above. URL is now meant for opening connections, not for parsing or validating. For pure parsing, use java.net.URI and convert only when you actually need to connect:
URI uri = URI.create("https://example.com/path?x=1"); // parse / validate
URL url = uri.toURL(); // only to open a connection
There is also a notorious trap to know about: URL.equals() and URL.hashCode() can perform DNS resolution to compare hosts, making them slow and network-dependent. Never put URL objects in a HashSet or use them as map keys. Use URI instead: its equality compares the parsed components as text and never touches the network.
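As a sketch of the safer pattern, the following collects links into a set of URI values rather than URL values. The class name SeenLinks, the distinct helper, and the sample links are hypothetical and exist only for this illustration.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SeenLinks {
    // Hypothetical helper: collects distinct links as URI values.
    // URI.equals and URI.hashCode work on the parsed components only,
    // so adding to the set never triggers a DNS lookup.
    static Set<URI> distinct(List<String> links) {
        Set<URI> seen = new LinkedHashSet<>();
        for (String link : links) {
            seen.add(URI.create(link));   // throws IllegalArgumentException on bad syntax
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<URI> seen = distinct(List.of(
                "https://example.com/docs",
                "https://example.com/docs",        // exact duplicate, dropped
                "https://example.com/docs?x=1"));  // different query, kept
        System.out.println(seen.size());           // 2
    }
}
```

When a connection is finally needed, each stored URI can be converted with toURL() at that point.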
A worked example: dissecting and resolving URLs
This program parses a fully-loaded URL into its parts, resolves a relative reference, and contrasts URL with the parsing-friendly URI — all offline, since parsing touches no network.
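A minimal sketch of such a program, reconstructed from the description above rather than taken from an original listing, might look like the following. The class name UrlTour, the report and toUrl helpers, the www.example.com addresses, and the user:secret credentials are all illustrative assumptions.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

final class UrlTour {
    // Hypothetical helper: URI -> URL without a checked exception.
    static URL toUrl(URI uri) {
        try {
            return uri.toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Builds the report lines; kept apart from main so they can be inspected.
    static List<String> report() {
        List<String> out = new ArrayList<>();

        // 1. Dissect a fully-loaded URL, built via URI rather than a URL constructor.
        URL full = toUrl(URI.create(
                "https://user:secret@www.example.com:8443/docs/java?topic=net#intro"));
        out.add("protocol = " + full.getProtocol());
        out.add("userInfo = " + full.getUserInfo());
        out.add("host = " + full.getHost());
        out.add("port = " + full.getPort());
        out.add("path = " + full.getPath());
        out.add("query = " + full.getQuery());
        out.add("ref = " + full.getRef());

        // 2. A URL with no port written: getPort() is -1, the default is separate.
        URL plain = toUrl(URI.create("http://www.example.com/docs"));
        out.add("no-port getPort = " + plain.getPort());
        out.add("no-port getDefaultPort = " + plain.getDefaultPort());

        // 3. Resolve a relative reference against a base.
        URI base = URI.create("https://www.example.com/a/b/page.html");
        out.add("resolved = " + base.resolve("../images/logo.png"));

        // 4. URI equality is decided from the parsed components, with no network access.
        URI first = URI.create("https://www.example.com/docs");
        URI second = URI.create("https://www.example.com/docs");
        out.add("uri equals = " + first.equals(second));

        return out;
    }

    public static void main(String[] args) {
        report().forEach(System.out::println);
    }
}
```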
What to take from the run:
- Each getter pulled one labelled slice out of the single URL string: protocol, user-info, host, port, path, query, and the #intro fragment via getRef(). A URL is a structured value, not just text; parsing it once and reading fields beats hand-rolling String.split on / and ?.
- The no-port URL reported getPort() == -1 while getDefaultPort() returned 80. The distinction matters when you build a connection: -1 means "the port was omitted," and you fall back to the protocol default rather than treating -1 as a real port.
- base.resolve("../images/logo.png") produced an absolute URL by climbing out of /a/b/, the exact relative-link resolution a browser performs. Relative references are resolved against a base, so the base's path determines where .. lands.
- The example built the URL via URI.create(...).toURL() rather than the deprecated new URL(String). On modern JDKs that is the recommended path: parse and validate with URI, convert to URL only at the moment you open a connection.
- URI.equals compared two identical URIs textually and returned true with no network involved. This is the safe way to compare and to use as map keys; URL.equals could have triggered a DNS lookup to resolve the host, which is slow and can fail offline.
Practice
A teammate stores downloaded resources in a 'HashSet<URL>' to skip duplicates, and notices the program is slow and sometimes hangs when offline. What is the correct explanation and fix?