Publicly accessible URL throws IOException
I want to access the link http://www.nation.co.ke/business/seedsofgold/Egg-imports-from-Uganda-hatch-big-losses-for-farmers/-/2301238/2897930/-/dpeqesz/-/index.html
The link is publicly accessible and even loads with curl.
But in Java code it throws Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://www.nation.co.ke/business/seedsofgold/Egg-imports-from-Uganda-hatch-big-losses-for-farmers/-/2301238/2897930/-/dpeqesz/-/index.html
Here is the code:
/**
 * Reads up to the first 16384 characters of an HTML page.
 *
 * @param url the HTML page
 * @return the page content, or null if the response is not HTML
 * @throws IOException
 */
public static String getPage(String url) throws IOException {
    URL u = new URL(url);
    URLConnection conn = u.openConnection();
    String mime = conn.getContentType();
    if (!StringUtils.containsIgnoreCase(mime, "text/html")) {
        return null; // don't continue if not HTML
    }
    else {
        // read the response body, using BufferedReader for performance
        InputStream in = conn.getInputStream();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, Charset.defaultCharset()));
        int n = 0, totalRead = 0;
        char[] buf = new char[1024];
        StringBuilder content = new StringBuilder();
        // read until EOF or first 16384 characters
        while (totalRead < 16384 && (n = reader.read(buf, 0, buf.length)) != -1) {
            content.append(buf, 0, n);
            totalRead += n;
        }
        reader.close();
        return content.toString();
    }
}
The error is thrown at:
InputStream in = conn.getInputStream();
The same code works for other URLs.
Try adding
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11");
to your connection, right after
URLConnection conn = u.openConnection();
Many websites block access when no proper user agent is set.
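To see the effect without depending on the real site, here is a minimal sketch that runs against a local stand-in server. The class name UserAgentDemo, the helpers statusFor and startPickyServer, and the rule "reject anything that does not start with Mozilla/" are all assumptions made for illustration; how the actual site filters requests is not known. The sketch relies on HttpURLConnection sending a "Java/<version>" user agent when none is set.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class UserAgentDemo {
    // The browser-like User-Agent string suggested in the answer.
    static final String BROWSER_UA =
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 "
        + "(KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11";

    /** Sends a GET and returns the HTTP status; a null userAgent keeps Java's default. */
    static int statusFor(String url, String userAgent) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            if (userAgent != null) {
                // Must be set before the request is sent.
                conn.setRequestProperty("User-Agent", userAgent);
            }
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical stand-in for a site that answers 403 to non-browser user agents. */
    static HttpServer startPickyServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String ua = exchange.getRequestHeaders().getFirst("User-Agent");
                boolean looksLikeBrowser = ua != null && ua.startsWith("Mozilla/");
                byte[] body = (looksLikeBrowser ? "<html>ok</html>" : "Forbidden")
                    .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "text/html");
                exchange.sendResponseHeaders(looksLikeBrowser ? 200 : 403, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = startPickyServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            System.out.println("default user agent: " + statusFor(url, null));
            System.out.println("browser user agent: " + statusFor(url, BROWSER_UA));
        } finally {
            server.stop(0);
        }
    }
}
```

In getPage from the question, the equivalent change is the single setRequestProperty call placed before conn.getContentType(), since that call already triggers the request.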
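If you receive an HTTP 403 status code, it means that access to the resource identified by the URL is forbidden for some reason.
A web server may return a 403 Forbidden HTTP status code in response to a client's request for a web page or resource to indicate that the server can be reached and understood the request, but refuses to take any further action.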
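If it helps with debugging, the status and the server's explanation can be inspected directly instead of letting getInputStream() throw. The following is only a sketch: ForbiddenInspector, Result, fetch and startForbiddenServer are hypothetical names, and the local server that always answers 403 is a stand-in so the example runs without the real site. It uses HttpURLConnection.getResponseCode(), which returns the code for 4xx responses, and getErrorStream(), which exposes the body the server sent along with the error.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class ForbiddenInspector {
    /** HTTP status plus whatever body the server sent, for success or error. */
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Fetches a URL and returns status and body without throwing on 4xx/5xx. */
    static Result fetch(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            try {
                int status = conn.getResponseCode();
                // For error statuses the body is on the error stream, which may be null.
                InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
                String body = "";
                if (in != null) {
                    try (InputStream s = in) {
                        body = new String(s.readAllBytes(), StandardCharsets.UTF_8);
                    }
                }
                return new Result(status, body);
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical local server that refuses every request with 403. */
    static HttpServer startForbiddenServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = "Forbidden".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(403, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = startForbiddenServer();
        try {
            Result r = fetch("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            System.out.println(r.status + " " + r.body);
        } finally {
            server.stop(0);
        }
    }
}
```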