[Planetlab-users] sendto: operation not permitted
Patrick Verkaik
pverkaik at cs.ucsd.edu
Sun Nov 12 21:40:32 EST 2006
Hi Neil,
I believe we're doing everything according to the specs. I condensed the
connecting side of our application into a small example program (attached)
that reproduces the behaviour. I would appreciate it if you or someone else
could take a quick look at it.
The program binds a socket to a local port and (through a separate raw
socket) sends a SYN and then immediately an ACK packet to REMOTE_ADDR
(132.239.17.226) port REMOTE_TCP_PORT (33445). For each packet, if sendto()
fails with EPERM the program sleeps for a second and retries, and keeps doing
so until the EPERM goes away.
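As a rough sketch of the first step described above (claiming a local port by binding an ordinary TCP socket and asking the kernel which port it picked), something like the following could be used. The helper name claim_tcp_port is hypothetical and is not part of the attached program; unlike the attachment, it keeps the TCP socket open and hands it back to the caller rather than closing it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: bind a TCP socket to an ephemeral port on addr
 * (network byte order) and report the chosen port in host byte order.
 * Returns the open socket, or -1 on error. The caller keeps the socket
 * open for as long as it wants to hold on to the port. */
int claim_tcp_port(uint32_t addr, uint16_t *port_out)
{
    assert(port_out != NULL);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = addr;
    sin.sin_port = 0;               /* let the kernel pick a free port */

    if (bind(fd, (struct sockaddr *) &sin, sizeof(sin)) < 0) {
        close(fd);
        return -1;
    }

    /* Ask the kernel which port it actually assigned */
    socklen_t len = sizeof(sin);
    if (getsockname(fd, (struct sockaddr *) &sin, &len) < 0) {
        close(fd);
        return -1;
    }

    *port_out = ntohs(sin.sin_port);
    return fd;
}
```

A caller would pass the node's address (for example the result of inet_aton on LOCAL_ADDR) and use the returned port as the TCP source port of the raw packets.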
This is the tcpdump output when the remote host is allowed to send back a
SYN/ACK:
20:53:18.068949 IP 132.239.17.224.50227 > 132.239.17.226.33445: S 10:10(0) win 20502
20:53:18.079683 IP 132.239.17.226.33445 > 132.239.17.224.50227: S 905277696:905277696(0) ack 11 win 5712 <mss 1428>
20:53:19.071081 IP 132.239.17.224.50227 > 132.239.17.226.33445: . ack 3389689600 win 20502
(The program gets one EPERM since its first attempt at sending ACK
precedes the SYN/ACK from the remote host.)
Tcpdump output when the remote host sends back RST instead of SYN/ACK:
21:05:26.991199 IP 132.239.17.224.52034 > 132.239.17.226.33445: S 10:10(0) win 20502
21:05:26.991347 IP 132.239.17.226.33445 > 132.239.17.224.52034: R 0:0(0) ack 11 win 0
21:05:26.991979 IP 132.239.17.224.52034 > 132.239.17.226.33445: . ack 0 win 20502
So far so good. However, if I suppress the SYN/ACK from the remote host I
get this:
21:09:03.005807 IP 132.239.17.224.52554 > 132.239.17.226.33445: S 10:10(0) win 20502
21:11:03.555572 IP 132.239.17.224.52554 > 132.239.17.226.33445: . ack 0 win 20502
For a full two minutes the program repeatedly gets EPERM as it tries to
send the ACK. After the two minutes have passed the ACK finally goes
through.
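To put a number on how long the EPERM condition lasts, the retry loop can be wrapped so that it counts attempts and measures elapsed time. The following is only a sketch: retry_while_eperm, struct retry_stats and fake_send are hypothetical names, and fake_send is a stand-in for the real sendto() call so that the loop can be exercised without a raw socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success, -1 with errno set. */
typedef int (*send_fn)(void *ctx);

struct retry_stats {
    unsigned attempts;    /* calls made, including the final one */
    double elapsed_sec;   /* time from before the first call to after the last */
};

/* Call fn until it succeeds, fails with an error other than EPERM, or
 * max_attempts is reached, sleeping delay_ms between attempts.
 * Returns 0 on success, -1 otherwise (errno is from the last attempt). */
int retry_while_eperm(send_fn fn, void *ctx, unsigned max_attempts,
                      long delay_ms, struct retry_stats *stats)
{
    assert(fn != NULL && stats != NULL);

    struct timespec start, now;
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int rc = -1;
    int saved_errno = 0;

    stats->attempts = 0;
    clock_gettime(CLOCK_MONOTONIC, &start);
    while (stats->attempts < max_attempts) {
        stats->attempts++;
        rc = fn(ctx);
        if (rc == 0)
            break;
        saved_errno = errno;
        if (saved_errno != EPERM)
            break;                      /* some other failure: give up */
        if (stats->attempts < max_attempts)
            nanosleep(&delay, NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &now);
    stats->elapsed_sec = (double) (now.tv_sec - start.tv_sec)
                       + (double) (now.tv_nsec - start.tv_nsec) / 1e9;

    if (rc != 0)
        errno = saved_errno;            /* report the sender's error */
    return rc == 0 ? 0 : -1;
}

/* Stand-in sender for demonstration: fails with EPERM until the counter
 * pointed to by ctx reaches zero, then succeeds. */
int fake_send(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        errno = EPERM;
        return -1;
    }
    return 0;
}
```

In the real program the callback would wrap the sendto() on the raw socket, and the caller would log stats.attempts and stats.elapsed_sec once the packet finally goes through.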
In the above the program is running on planetlab1.ucsd.edu. When run on
planet-lab2.cs.ucr.edu it shows exactly the same behaviour.
Can you confirm that we're doing everything the way we should?
Thanks,
Patrick
On Thu, 9 Nov 2006, Neil Spring wrote:
> Patrick,
>
> PlanetLab's vnet works on the assumption that you can send tcp packets so
> long as the source port is one your slice "owns" via opening a socket and
> calling bind, and that you can only receive tcp packets to a destination port
> that you "own" in the same way.
>
> We send gobs of tcp packets, without EPERM on sendto. If it's letting a
> packet through after 100,000 retries, that sounds like a bug.
>
> Planetlab rate limiting has been found to delay packets by 30 seconds (I'm as
> surprised as anyone, but I heard this from a reputable source on Sunday).
> I'd guess that if you were really blowing the queue, you'd get ENOBUFS rather
> than EPERM.
>
> I'm pretty sure Mark wrote a vnet faq, or at least the documentation should
> get you pointed in the right direction.
>
> -neil
>
> On Nov 9, 2006, at 2:58 AM, Patrick Verkaik wrote:
>
>>
>> Hi,
>>
>> I have a question about the following sendto() behaviour that I don't quite
>> understand.
>>
>> We're sending raw TCP packets using sendto() but getting intermittent EPERM
>> errors. We've found that repeatedly retrying the sendto() (with exactly the
>> same packet) until it no longer gives EPERM eventually gets the packet
>> through. During a short connection (perhaps lasting 10 seconds and sending
>> across less than 100 bytes of data) we sometimes see about 100,000 failed
>> sendto() attempts (with a usleep(1) separating the attempts roughly 20,000
>> attempts).
>>
>> Another curious fact is that we only see this behaviour when we're
>> tunneling the reverse TCP traffic into the sending host. The host therefore
>> doesn't see e.g. SYN/ACK packets coming back in response to outgoing SYNs.
>>
>> Can anyone explain this? Is this how Planetlab implements rate limiting or
>> prevents SYN-flooding attacks being launched from Planetlab?
>>
>> (Btw: we're running as root and using the node's IP address as source IP
>> address.)
>>
>> Thanks,
>>
>> Patrick Verkaik
>> Barath Raghavan
>>
>> _______________________________________________
>> Users mailing list: Users at lists.planet-lab.org
>> https://lists.planet-lab.org/mailman/listinfo/users
--
Patrick
-------------- next part --------------
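/* Keep assert() active; this must come before <assert.h> to have any effect. */
#undef NDEBUG
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* memset */
#include <strings.h>  /* bzero */
#include <unistd.h>   /* close, sleep */
#include <errno.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>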
// customise
#define REMOTE_TCP_PORT 33445
#define LOCAL_ADDR ("132.239.17.224")
#define REMOTE_ADDR ("132.239.17.226")
struct tcppseudo
{
    u_int32_t saddr;
    u_int32_t daddr;
    u_int16_t protocol;
    u_int16_t tcp_len;
};
u_int16_t
ip_cksum(u_int16_t *buf, size_t count, u_int16_t *buf2, size_t count2)
{
    /* Code adapted from RFC 1071; sums two buffers into one checksum. */
    register u_int32_t sum = 0;

    /* First buffer, 16 bits at a time */
    while (count > 1) {
        sum += *(buf++);
        count -= 2;
    }
    /* Second buffer (e.g. the TCP pseudo-header), 16 bits at a time */
    while (count2 > 1) {
        sum += *(buf2++);
        count2 -= 2;
    }
    /* Add left-over byte of the first buffer, if any */
    if (count > 0)
        sum += *(unsigned char *) buf;
    /* Fold 32-bit sum to 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return ~sum & (u_int32_t) 0xffff;
}
int
send_raw(int rawsock, char *ip_buf)
{
    struct iphdr *iphdr = (struct iphdr *) ip_buf;
    unsigned pktlen = ntohs(iphdr->tot_len);
    assert(iphdr->protocol == IPPROTO_TCP);
    unsigned iphdrlen = iphdr->ihl * 4;
    struct tcphdr *tcphdr = (struct tcphdr *) &ip_buf[iphdrlen];
    char *l4_buf = ip_buf + iphdrlen;
    unsigned l4_len = pktlen - iphdrlen;

    /* The TCP checksum covers the segment plus a pseudo-header */
    tcphdr->check = 0;
    struct tcppseudo tcppseudo;
    tcppseudo.saddr = iphdr->saddr;
    tcppseudo.daddr = iphdr->daddr;
    tcppseudo.protocol = htons(IPPROTO_TCP);
    tcppseudo.tcp_len = htons(l4_len);
    tcphdr->check = ip_cksum((u_int16_t *) l4_buf, l4_len,
                             (u_int16_t *) &tcppseudo, sizeof(struct tcppseudo));

    iphdr->check = 0;
    iphdr->check = ip_cksum((u_int16_t *) iphdr, sizeof(struct iphdr), 0, 0);

    struct sockaddr_in dest_addr;
    socklen_t dest_addr_len = sizeof(struct sockaddr_in);
    bzero(&dest_addr, dest_addr_len);
    dest_addr.sin_family = AF_INET;
    dest_addr.sin_addr.s_addr = iphdr->daddr;
    dest_addr.sin_port = tcphdr->dest;

    ssize_t sent;
    sent = sendto(rawsock, ip_buf, pktlen, 0, (struct sockaddr *) &dest_addr,
                  dest_addr_len);
    if (sent != (ssize_t) pktlen) {
        if (sent < 0) {
            int saved_errno = errno;   /* perror() may clobber errno */
            perror("sendto");
            errno = saved_errno;       /* the caller checks for EPERM */
        }
        else
            fprintf(stderr, "sendto returned %zd != %u bytes\n", sent, pktlen);
        return 0;
    }
    return 1;
}
int
allocate_raw_tcp_port(u_int32_t saddr)
{
    struct sockaddr_in tcp_sin;
    memset(&tcp_sin, 0, sizeof(tcp_sin));
    tcp_sin.sin_family = AF_INET;
    tcp_sin.sin_addr.s_addr = saddr;
    tcp_sin.sin_port = 0;

    // discover a free port
    int tcp_sock = socket(PF_INET, SOCK_STREAM, 0);
    if (tcp_sock == -1) {
        perror("allocate_raw_tcp_port: creating TCP socket");
        exit(1);
    }
    if (bind(tcp_sock, (struct sockaddr *) &tcp_sin,
             sizeof(struct sockaddr_in)) < 0) {
        perror("allocate_raw_tcp_port: binding TCP port");
        return -1;
    }

    struct sockaddr_in sa;
    socklen_t sa_len = sizeof(struct sockaddr_in);
    if (getsockname(tcp_sock, (struct sockaddr *) &sa, &sa_len) < 0) {
        perror("allocate_raw_tcp_port: getsockname");
        return -1;
    }
    tcp_sin = sa;
    fprintf(stderr, "allocate_raw_tcp_port: briefly bound tcp socket on %s, %u\n",
            inet_ntoa(tcp_sin.sin_addr),
            (unsigned) ntohs(tcp_sin.sin_port));

    // close port so that we can now bind a raw socket to it. note: same
    // weird behaviour occurs when we don't bind a raw socket and just go with
    // tcp_sock
    close(tcp_sock); // XXX may lose the port.

    struct sockaddr_in raw_sin = tcp_sin;
    fprintf(stderr, "allocate_raw_tcp_port: binding raw on %s, %u\n",
            inet_ntoa(raw_sin.sin_addr),
            (unsigned) ntohs(raw_sin.sin_port));
    int tcp_raw_sock = socket(PF_INET, SOCK_RAW, IPPROTO_TCP);
    if (tcp_raw_sock == -1) {
        perror("allocate_raw_tcp_port: creating TCP raw socket");
        exit(1);
    }
    if (bind(tcp_raw_sock, (struct sockaddr *) &raw_sin,
             sizeof(struct sockaddr_in)) < 0) {
        perror("allocate_raw_tcp_port: binding raw socket");
        return -1;
    }

    sa_len = sizeof(struct sockaddr_in);
    if (getsockname(tcp_raw_sock, (struct sockaddr *) &sa, &sa_len) < 0) {
        perror("allocate_raw_tcp_port: getsockname");
        return -1;
    }
    raw_sin = sa;
    fprintf(stderr, "allocate_raw_tcp_port: bound tcp raw socket on %s, %u\n",
            inet_ntoa(raw_sin.sin_addr),
            (unsigned) ntohs(raw_sin.sin_port));

    // tcp_raw_sock is deliberately left open so that the port stays bound.
    // return the port in host byte order
    return ntohs(raw_sin.sin_port);
}
int
main(int argc, char **argv)
{
    unsigned pkt_len = sizeof(struct iphdr) + sizeof(struct tcphdr);
    char ip_buf[pkt_len+1000];
    u_int32_t saddr;
    u_int32_t daddr;

    if (! inet_aton(LOCAL_ADDR, (struct in_addr *) &saddr)) {
        fprintf(stderr, "inet_aton error\n");
        exit(1);
    }
    if (! inet_aton(REMOTE_ADDR, (struct in_addr *) &daddr)) {
        fprintf(stderr, "inet_aton error\n");
        exit(1);
    }

    // raw socket on which we supply our own IP header
    int rawsock;
    if ((rawsock = socket(PF_INET, SOCK_RAW, IPPROTO_TCP)) < 0) {
        perror("socket");
        exit(1);
    }
    int tmp = 1;
    if (setsockopt(rawsock, IPPROTO_IP, IP_HDRINCL, &tmp, sizeof(tmp)) < 0) {
        perror("setsockopt");
        exit(1);
    }

    int local_tcp_port;
    if ((local_tcp_port = allocate_raw_tcp_port(saddr)) < 0)
        exit(1);
    fprintf(stderr, "allocate_raw_tcp_port returned port %d\n", local_tcp_port);

    struct iphdr *iphdr = (struct iphdr *) ip_buf;
    bzero(iphdr, sizeof(struct iphdr));
    iphdr->version = 4;
    iphdr->ihl = 5;
    iphdr->tos = 0;
    iphdr->tot_len = htons(pkt_len);
    iphdr->id = htons(5);
    iphdr->frag_off = 0;
    iphdr->ttl = 100;
    iphdr->protocol = IPPROTO_TCP;
    iphdr->saddr = saddr;
    iphdr->daddr = daddr;

    char *l4_buf = &ip_buf[sizeof(struct iphdr)];
    struct tcphdr *tcphdr = (struct tcphdr *) l4_buf;
    bzero(tcphdr, sizeof(struct tcphdr));
    tcphdr->source = htons(local_tcp_port);
    tcphdr->dest = htons(REMOTE_TCP_PORT);
    tcphdr->seq = htonl(10);
    tcphdr->ack_seq = 0;
    tcphdr->doff = 5;
    // note: not passed through htons(), so on a little-endian host this
    // goes out as 20502, which is the window the tcpdump traces show
    tcphdr->window = 5712;
    tcphdr->urg_ptr = 0;

    // first packet: SYN; retry once a second for as long as we get EPERM
    tcphdr->syn = 1;
    while (1) {
        if (send_raw(rawsock, ip_buf))
            break;
        if (errno != EPERM) {
            fprintf(stderr, "dropping packet\n");
            return 1;
        }
        sleep(1);
    }
    fprintf(stderr, "sent packet\n");

    // second packet: ACK, sent immediately after the SYN, same retry policy
    tcphdr->syn = 0;
    tcphdr->ack = 1;
    while (1) {
        if (send_raw(rawsock, ip_buf))
            break;
        if (errno != EPERM) {
            fprintf(stderr, "dropping packet\n");
            return 1;
        }
        sleep(1);
    }
    fprintf(stderr, "sent packet\n");
    return 0;
}