Clarified: get string from URL

  • manal
  • Born
  • Posts: 1

Post 3+ Months Ago

I need a C++ function that takes a URL as its argument and returns a string containing the first 6 characters of the web page that the URL points to.
(That is, if I apply it to http://www.ozzu.com, I should get back something like the first 6 characters of "OZZU WEBMASTER FORUM". The URL I'm actually using, however, is a lot less aesthetically programmed.)

Please help! :D
thanks!
manal
  • b_heyer
  • Web Master
  • Posts: 4581
  • Loc: Maryland

Post 3+ Months Ago

By first six characters do you mean including the http://?
  • dr nick
  • Proficient
  • Posts: 263
  • Loc: Frankfurt

Post 3+ Months Ago

Code: [ Select ]
void first6(unsigned char *six, unsigned char *url) {
  int i;
  int offset;

  /* offset is the index in url at which copying of the 6 characters starts */
  offset = 0; // if you know your url starts with "http://", you could set this value to 7

  /* note: six must have room for 6 bytes and is not null-terminated */
  for (i = 0; i < 6; i++) {
    six[i] = url[i + offset];
  }
}


OK, this isn't great code, but based on what you've asked so far it should give you a starting point. You'll probably want to refine it to check whether the URL starts with http or something similar. Also, note that I didn't check whether the URL is actually at least 6 characters long, so you will want to test for that too.
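
The two refinements mentioned above (checking for a leading "http://" and coping with input shorter than 6 characters) could be sketched with std::string roughly as follows. The name firstChars and its count parameter are made up for this illustration and are not part of the code posted above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns up to `count` characters of `url`,
// skipping a leading "http://" when one is present.
std::string firstChars(const std::string& url, std::size_t count = 6) {
    const std::string scheme = "http://";
    std::size_t offset = 0;
    if (url.compare(0, scheme.size(), scheme) == 0) {
        offset = scheme.size();
    }
    // substr clamps the length to what is available, so short input is safe.
    return url.substr(offset, count);
}
```

Calling firstChars("http://www.ozzu.com") would give the six characters after the scheme, and an input shorter than six characters simply comes back whole.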
  • dr nick
  • Proficient
  • Posts: 263
  • Loc: Frankfurt

Post 3+ Months Ago

Wait a minute... are you actually asking for the title of the page at that URL? That is, given the URL, you fetch the web page and then return the title of that page?
  • Bigwebmaster
  • Site Admin
  • Posts: 9092
  • Loc: Seattle, WA & Phoenix, AZ

Post 3+ Months Ago

Well, creating that function would take a bit of work, since it has to do a few things. First, it has to request the web page at the URL. After that, it needs to parse the source of the page it requested to find what is in the title tag. Finally, it needs to keep only the first 6 characters of what is in the title tag. Anyway, I am not going to write all of that for you, but I can give you some parts that may help. You can probably use some of what Dr Nick gave you with this. Here is how you can request a URL in C++ and get the output of what you request:

Code: [ Select ]
  string serverReply;
  // connect to the server: host, port 80, timeout of about 2 seconds
  try {
    ClientSocket clientSocket("www.yourdomain.com", 80, 2);

    try {
      // HTTP header lines end with CRLF; an empty line ends the request
      clientSocket << "GET http://www.yourdomain.com/ HTTP/1.0\r\n";
      clientSocket << "Host: www.yourdomain.com\r\n\r\n";
      clientSocket >> serverReply;
    }
    catch(SocketException& e) {
      cout << "Exception was caught: " << e.description() << "<br>" << endl;
    }

    cout << "Response from server: " << serverReply << "<br><br>" << endl;

  }
  catch(SocketException& e) {
    cout << "Exception was caught: " << e.description() << "<br>" << endl;
  }
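
Once serverReply holds the response, the remaining two steps (find what is in the title tag, then keep the first 6 characters) might look something like the sketch below. The names extractTitle and titlePrefix are invented for this example, and the search is a plain substring match rather than a real HTML parser, so it assumes a reasonably ordinary page.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between <title ...> and </title>,
// or an empty string when no complete title element is found.
std::string extractTitle(const std::string& html) {
    // Search a lowercased copy so that <TITLE> and <title> both match.
    std::string lower(html);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::size_t open = lower.find("<title");
    if (open == std::string::npos) return "";
    const std::size_t start = lower.find('>', open);
    if (start == std::string::npos) return "";
    const std::size_t end = lower.find("</title", start + 1);
    if (end == std::string::npos) return "";
    // Indices from the lowercased copy are valid in the original string too.
    return html.substr(start + 1, end - start - 1);
}

// Hypothetical helper: first `count` characters of the page title.
std::string titlePrefix(const std::string& serverReply, std::size_t count = 6) {
    return extractTitle(serverReply).substr(0, count);
}
```

Because the whole reply is searched, the HTTP headers in front of the page source do not need to be stripped first.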


You will also need these classes / header files.

Here is clientsocket.h

Code: [ Select ]
#ifndef __clientsocket_h
#define __clientsocket_h  // Prevent multiple #includes

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <unistd.h>
#include <string>
#include <arpa/inet.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/time.h>

const int MAXHOSTNAME = 200;
const int MAXCONNECTIONS = 5;
const int MAXRECV = 50000;

class Socket {
private:
  int m_sock;
  sockaddr_in m_addr;

public:
  Socket();
  virtual ~Socket();

  // Server initialization
  bool create();
  bool bind (const int port);
  bool listen() const;
  bool accept (Socket&) const;

  // Client initialization
  bool connect (const std::string host, const int port, const int timeOut);

  // Data transmission
  bool send (const std::string) const;
  int recv (std::string&) const;

  void set_non_blocking (const bool);
  bool is_valid() const { return m_sock != -1; }

};


class ClientSocket : private Socket {
public:
  ClientSocket (std::string host, int port, int timeOut);
  virtual ~ClientSocket(){};

  const ClientSocket& operator << (const std::string&) const;
  const ClientSocket& operator >> (std::string&) const;
};


class SocketException {
private:
  std::string m_s;
public:
  SocketException(std::string s) : m_s(s) {};
  ~SocketException(){};

  std::string description() { return m_s; }
};


#endif // __clientsocket_h


Here is clientsocket.cpp

Code: [ Select ]
#include "clientsocket.h"

Socket::Socket() : m_sock (-1)
{
  memset (&m_addr, 0, sizeof(m_addr));
}

Socket::~Socket()
{
  if (is_valid())
   ::close (m_sock);
}

bool Socket::create()
{
  m_sock = socket(AF_INET, SOCK_STREAM, 0);

  if (! is_valid())
   return false;

  // TIME_WAIT - argh
  int on = 1;
  if (setsockopt(m_sock, SOL_SOCKET, SO_REUSEADDR, (const char*) &on, sizeof(on)) == -1)
   return false;

  return true;
}

bool Socket::bind (const int port)
{
  if (! is_valid())
   return false;

  m_addr.sin_family = AF_INET;
  m_addr.sin_addr.s_addr = INADDR_ANY;
  m_addr.sin_port = htons(port);

  int bind_return = ::bind(m_sock, (struct sockaddr *) &m_addr, sizeof(m_addr));

  if(bind_return == -1)
   return false;

  return true;
}


bool Socket::listen() const
{
  if(! is_valid())
   return false;

  int listen_return = ::listen(m_sock, MAXCONNECTIONS);

  if (listen_return == -1)
   return false;

  return true;
}

bool Socket::accept(Socket& new_socket) const
{
  socklen_t addr_length = sizeof(m_addr);
  new_socket.m_sock = ::accept (m_sock, (sockaddr *) &m_addr, &addr_length);

  // accept() returns -1 on failure; 0 is a valid descriptor
  return new_socket.m_sock != -1;
}

bool Socket::send(const std::string s) const
{
  int status = ::send(m_sock, s.c_str(), s.size(), 0);

  if (status == -1)
   return false;
  else
   return true;
}


int Socket::recv(std::string& s) const
{
  char buf[MAXRECV + 1];
  s = "";

  memset(buf, 0, MAXRECV + 1);

  // A single call may return only part of what the server sends
  int status = ::recv(m_sock, buf, MAXRECV, 0);

  // -1 is an error (see errno); 0 means the peer closed the connection
  if (status <= 0)
   return 0;

  // Copy exactly the bytes received so embedded '\0' bytes do not truncate the data
  s.assign(buf, status);
  return status;
}



bool Socket::connect(const std::string host, const int port, const int timeOut)
{
  if (! is_valid()) return false;

  struct hostent *hp = gethostbyname(host.c_str());
  if (! hp) return false;

  memset((char *) &m_addr, 0, sizeof(m_addr));
  memmove((char *) &m_addr.sin_addr, hp->h_addr, hp->h_length);
  m_addr.sin_family = AF_INET;
  m_addr.sin_port = htons(port);

  // On a non-blocking socket connect() normally returns -1 with EINPROGRESS
  if (::connect(m_sock, (sockaddr *) &m_addr, sizeof (m_addr)) == 0)
   return true;
  if (errno != EINPROGRESS)
   return false;

  // Wait until the socket becomes writable or the timeout expires
  struct timeval tv;
  fd_set writefds;
  tv.tv_sec = timeOut;
  tv.tv_usec = 500000;
  FD_ZERO(&writefds);
  FD_SET(m_sock, &writefds);

  if (select(m_sock+1, NULL, &writefds, NULL, &tv) <= 0)
   return false;

  // Writable does not imply success; SO_ERROR holds the result of the connect
  int err = 0;
  socklen_t len = sizeof(err);
  if (getsockopt(m_sock, SOL_SOCKET, SO_ERROR, (char *) &err, &len) == -1)
   return false;

  return err == 0;
}

void Socket::set_non_blocking(const bool b)
{
  int opts;

  opts = fcntl (m_sock, F_GETFL);

  if (opts < 0)
   return;

  if (b)
   opts = (opts | O_NONBLOCK);
  else
   opts = (opts & ~O_NONBLOCK);

  fcntl (m_sock, F_SETFL,opts);
}



ClientSocket::ClientSocket(std::string host, int port, int timeOut)
{
  if(! Socket::create())
   throw SocketException ( "Could not create client socket." );

  Socket::set_non_blocking(true);

  if(! Socket::connect (host, port, timeOut))
   throw SocketException ("Could not connect to host.");

  Socket::set_non_blocking(false);
}

const ClientSocket& ClientSocket::operator << (const std::string& s) const
{
  if (! Socket::send (s))
   throw SocketException ("Could not write to socket.");

  return *this;
}

const ClientSocket& ClientSocket::operator >> (std::string& s) const
{
  if (! Socket::recv(s))
   throw SocketException("Could not read from socket.");

  return *this;
}


© 1998-2014. Ozzu® is a registered trademark of Unmelted, LLC.