[netfilter-core] Mangle table rules are not taken into account in preliminary routing decision

Discussion:

Patrick McHardy

2007-10-11 04:10:23 UTC

Post by Konstantin Ushakov
Netfilter team,
we use netfilter under Linux kernel 2.4.31 and have the problem
described below.
Note that it can also be easily reproduced on the latest kernels.
- we want to connect to some host on TCP port 80
- in the kernel, after some time, we get to the ip_route_output_slow function:
    if (fib_lookup(&key, &res)) {
        res.fi = NULL;
        if (oldkey->oif) {    /* oif is zero at this point, so we miss that "if" */
            ......
        }
        ...
        err = -ENETUNREACH;
        goto out;
    }
The lookup fails to find a route, as we don't have fwmark set for the
packet and there is no route for packets without fwmark (see the
configuration attached). So ENETUNREACH is returned and the packet fails
to be sent. In fact the packet could be routed correctly, but this would
happen later, in the ip_build_xmit function, in the netfilter hook for
LOCAL_OUT packets.
- is it a bug, or is it a deliberate decision to have such behaviour?
- is there any known ad hoc solution for the problem?

It's a consequence of how routing by fwmark works. It's not perfect,
but I don't see a better solution since the initial routing takes
place before we even have a packet.

Just add a route to the dummy device or something like that, that
should make sure you don't get ENETUNREACH.
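
The dummy-device suggestion above might be sketched as follows. This is only an illustration under stated assumptions: the interface name dummy0, the table number 200 and the rule priority 60000 are hypothetical values that do not appear in this thread. The route is placed in its own table, consulted last, so that it cannot shadow fwmark rules that sit after the main table. The script prints the commands by default and only runs them (as root) when APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper: print each command unless APPLY=1 is set (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Load the dummy driver and bring the interface up (dummy0 is an assumed name).
run modprobe dummy
run ip link set dummy0 up

# Fallback default route in a dedicated table (200 is a hypothetical number),
# so the initial lookup for a not-yet-marked packet finds some route.
run ip route add default dev dummy0 table 200

# Consult that table last (60000 is a hypothetical priority).
run ip rule add from all lookup 200 prio 60000
```

Whether traffic that ends up on the dummy device behaves acceptably for the application still has to be checked separately; the sketch only shows how to avoid the early ENETUNREACH.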
-
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Konstantin Ushakov

2007-10-11 06:47:45 UTC

Post by Patrick McHardy

I'm afraid that a dummy route does not solve the problem. I mean:
- we should not pass the packets out, so where should the route lead?
To loopback?
- another thing is that on 'send' (to, say, some external address,
port 239) with the dummy route we hang, but if the packet in fact can't
be routed, we should get ENETUNREACH.

The idea that we had is the following:

we mark all packets that have passed netfilter (mangle table) with a
specific mark (see configuration below).
We add 2 rules:
- unreachable, for packets that have passed the mangle table but should
not be routed
- a rule that looks up table #100 for all packets; in table #100 we have
a route like
ip route add default via 127.0.0.2 table 100

Local traffic that goes to TCP port 80 is routed correctly. Forwarded
traffic is not routed, and ENETUNREACH is received on the LAN side. BUT
for local traffic that should not be forwarded, we don't receive UNREACH;
'send' just hangs.

Example:

on host on LAN side of the router:
bash$ nc 192.168.1.5 81
(UNKNOWN) [192.168.1.5] 80 (www) : No route to host

BUT if we issue that same command on the router itself, it hangs.

That's the situation, so two questions:
- is there any other common way, apart from a dummy route? Or what
should the route look like so that it does not change the behaviour of
applications (see comments above)?
- what is wrong with our new idea? I mean, it looks like a bug in the
kernel, but I don't understand exactly where it is.

=================================================
Configuration:

# iptables -L -nv -t mangle
Chain PREROUTING (policy ACCEPT 715 packets, 80324 bytes)
 pkts bytes target prot opt in out source    destination
  707 78919 MARK   all  --  *  *   0.0.0.0/0 0.0.0.0/0   MARK set 0xb

Chain INPUT (policy ACCEPT 700 packets, 78677 bytes)
 pkts bytes target prot opt in out source    destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target prot opt in out source    destination

Chain OUTPUT (policy ACCEPT 118 packets, 64650 bytes)
 pkts bytes target prot opt in out source    destination
    0     0 MARK   all  --  *  *   0.0.0.0/0 0.0.0.0/0   MARK set 0xb
    0     0 MARK   tcp  --  *  *   0.0.0.0/0 0.0.0.0/0   tcp dpt:80 MARK set 0xa

Chain POSTROUTING (policy ACCEPT 52 packets, 4796 bytes)
 pkts bytes target prot opt in out source    destination

Rules:
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
32768: from all fwmark 0xa lookup 10
40000: from all fwmark 0xb lookup 99 unreachable
50000: from all lookup 100
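
For readers who want to reproduce the setup, the listings above could be produced by commands along these lines. This is a reconstruction from the output shown, not necessarily the exact commands that were used; the contents of table 10 are not shown in this message and are left out. The script prints the commands by default and only runs them (as root) when APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper: print each command unless APPLY=1 is set (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Mangle table: mark everything 0xb, then re-mark local TCP port 80 as 0xa.
run iptables -t mangle -A PREROUTING -j MARK --set-mark 0xb
run iptables -t mangle -A OUTPUT -j MARK --set-mark 0xb
run iptables -t mangle -A OUTPUT -p tcp --dport 80 -j MARK --set-mark 0xa

# Policy routing rules, with the priorities taken from the "Rules" listing.
run ip rule add from all fwmark 0xa lookup 10 prio 32768
run ip rule add from all fwmark 0xb lookup 99 unreachable prio 40000
run ip rule add from all lookup 100 prio 50000

# Catch-all route in table 100, as quoted earlier in this message.
run ip route add default via 127.0.0.2 table 100
```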

Thanks once again,
Konstantin.

Patrick McHardy

2007-10-11 07:21:35 UTC

Post by Konstantin Ushakov

Post by Patrick McHardy

- is it a bug, or is it a deliberate decision to have such behaviour?
- is there any known ad hoc solution for the problem?

I'm afraid that dummy route does not solve the problem. I mean
- we should not pass out the packets, so where should the route lead?
To loopback?

As I said, to the dummy device.

Post by Konstantin Ushakov
- another thing is that on 'send' (for, say, some external address,
port 239)
with dummy route we hang, but if in fact the packet can't be routed,
we should get ENETUNREACH.
[...]
we mark all packets that have passed netfilter (mangle table) with a
specific mark (see configuration below).
- unreachable, for packets that have passed the mangle table but should
not be routed
- a rule that looks up table #100 for all packets; in table #100 we have
a route like
ip route add default via 127.0.0.2 table 100
Local traffic that goes to tcp port 80 is routed correctly. Forwarded
traffic is not routed,
ENETUNREACH is received on the lan side. BUT for local traffic that
should not be forwarded,
we don't receive UNREACH, 'send' just hangs.
bash$ nc 192.168.1.5 81
(UNKNOWN) [192.168.1.5] 80 (www) : No route to host
BUT if we issue that same command on the router itself, it hangs.

Ah, I see the problem. The route returns unreachable, which
iptable_mangle translates to NF_DROP. The problem is that
netfilter itself can't return ENETUNREACH and there is no
valid output function attached to the dst_entry that would
send an icmp unreachable. I think the only thing you could
do is manually call icmp_send(ICMP_DEST_UNREACH) in
ip_route_me_harder for this case.


Pascal Hambourg

2007-10-11 09:13:42 UTC

Hello,

Post by Patrick McHardy
Ah, I see the problem. The route returns unreachable, which
iptable_mangle translates to NF_DROP. The problem is that
netfilter itself can't return ENETUNREACH and there is no
valid output function attached to the dst_entry that would
send an icmp unreachable. I think the only thing you could
do is manually call icmp_send(ICMP_DEST_UNREACH) in
ip_route_me_harder for this case.

What about the REJECT target ?

Konstantin Ushakov

2007-10-15 14:11:41 UTC

Post by Pascal Hambourg
Hello,

Post by Patrick McHardy
Ah, I see the problem. The route returns unreachable, which
iptable_mangle translates to NF_DROP. The problem is that
netfilter itself can't return ENETUNREACH and there is no
valid output function attached to the dst_entry that would
send an icmp unreachable. I think the only thing you could
do is manually call icmp_send(ICMP_DEST_UNREACH) in
ip_route_me_harder for this case.

What about the REJECT target ?

Correct me if I'm mistaken, but the REJECT target is only valid in the
filter table. But the packet does not reach the filter table, for the
reasons described by Patrick (as we DROP it after mangle). I can observe
this clearly when I insert a LOG rule into the filter table.
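
The LOG check mentioned above might look like the following sketch. The exact rule is not given in this thread, so the match on mark 0xb and the log prefix are assumptions chosen for illustration. If the packet were reaching the filter table, a matching line would show up in the kernel log. The script prints the command by default and only runs it (as root) when APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper: print each command unless APPLY=1 is set (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Insert a LOG rule at the top of filter/OUTPUT for packets marked 0xb.
# The prefix "mark-0xb-out:" is a hypothetical label, not from the thread.
run iptables -t filter -I OUTPUT -m mark --mark 0xb -j LOG --log-prefix "mark-0xb-out:"
```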

Pascal Hambourg

2007-10-15 15:01:17 UTC

Post by Konstantin Ushakov

Post by Pascal Hambourg
What about the REJECT target ?

Correct me if I'm mistaken, but REJECT target is only valid in filter
table.

Correct.

Post by Konstantin Ushakov
But the packet does not reach the filter table because of reasons
described by Patrick (as we DROP it after mangle).

I meant to use the REJECT target /instead of/ an "unreachable" routing
rule.

Remove the rule added by
    ip rule add from all fwmark 0xb lookup 99 unreachable prio 40000

And add
    iptables -t filter -A OUTPUT -m mark --mark 0xb \
        -j REJECT --reject-with icmp-net-unreachable